--- title: "Modifying MixSIAR plots" output: html_vignette vignette: > %\VignetteIndexEntry{Modifying MixSIAR plots} %\VignetteEngine{knitr::rmarkdown} \usepackage[utf8]{inputenc} --- ```{r include=F, echo=F} knitr::opts_chunk$set(fig.width=7, fig.height=5) ``` MixSIAR generates lots of output from a fit model. Up to version 3.1.12, by default these were printed to the R console with options to suppress this behavior and save png/pdf/txt files instead. Modifying these plots is necessary to make publication/presentation-quality figures. In version 3.1.13, we added the capability to return the `ggplot2` objects in order to make this process easier. As suggested in [#235](https://github.com/brianstock/MixSIAR/issues/235), `output_JAGS` has been split into separate functions. To maintain backward compatibility, the original `output_JAGS` function remains untouched and will continue to behave as before. However, a new argument to the `output_options` list is used (`output_options$return_obj = TRUE`) to return objects from the new functions: + `output_diagnostics`: named list of three data frames (one each for Gelman, Heidelberg-Welch, and Geweke) + `output_stats`: data frame of summary statistics + `output_posteriors`: named nested list of `ggplot` objects - `global`: overall/global proportions - `fac1`: factor 1 (if in model) - `fac2`: factor 2 (if in model) - `both`: if model with 2 fixed effects or 1 fixed effect + 1 random effect (cannot plot one without the other) - `cont`: continuous effect (if in model) - `sig`: random effect variance terms (if model has random effects) - `epsilon`: multiplicative error term (if model uses "Residual * Process" error) Below, we demonstrate how to modify output from the [Wolves](https://brianstock.github.io/MixSIAR/articles/wolves_ex.html) and [Alligator](https://github.com/brianstock/MixSIAR/blob/master/inst/example_scripts/mixsiar_script_alligator.R) examples. If you have not already done so, see these vignettes first for more commentary and explanation. ## Install MixSIAR from Github The latest changes are not yet on CRAN. Install the GitHub version: ```{r, eval=FALSE} install.packages("devtools") remotes::install_github("brianstock/MixSIAR", dependencies=T) ``` ## Run wolves example Run the wolves example with: ```{r, eval=FALSE} library(MixSIAR) mixsiar.dir <- find.package("MixSIAR") source(file.path(mixsiar.dir,"example_scripts","mixsiar_script_wolves_normal.R")) ``` ```{r setup, include=FALSE, echo=FALSE} library(MixSIAR) load(url("https://github.com/brianstock/MixSIAR/blob/master/Manual/wolves_normal.RData?raw=true")) ``` ## Set output options to return `ggplot` objects ```{r} output_options <- list(summary_save = TRUE, summary_name = "summary_statistics", sup_post = TRUE, plot_post_save_pdf = FALSE, plot_post_name = "posterior_density", sup_pairs = TRUE, plot_pairs_save_pdf = TRUE, plot_pairs_name = "pairs_plot", sup_xy = TRUE, plot_xy_save_pdf = TRUE, plot_xy_name = "xy_plot", gelman = TRUE, heidel = FALSE, geweke = TRUE, diag_save = TRUE, diag_name = "diagnostics", indiv_effect = FALSE, plot_post_save_png = FALSE, plot_pairs_save_png = FALSE, plot_xy_save_png = FALSE, diag_save_ggmcmc = FALSE, return_obj = TRUE) ``` ## Diagnostics The diagnostics output can be saved as a list of data frames: ```{r} diag <- output_diagnostics(jags.1, mix, source, output_options) ``` There is one data frame for each of: Gelman-Rubin, Geweke, and Heidelberger-Welch. ```{r} names(diag) ``` ```{r} head(diag$gelman) ``` ```{r} head(diag$geweke) ``` ## Summary statistics The summary statistics can be saved as a data frame. ```{r} df.stats <- output_stats(jags.1, mix, source, output_options) ``` You can access individual stats using the rownames. ```{r} rownames(df.stats) ``` For example, look at the Salmon diet proportion for Pack 4 wolves: ```{r} df.stats[rownames(df.stats) == "p.Pack 4.Salmon",] ``` For example, get the 95% CI for the Deer diet proportion for Region 1 wolves: ```{r} df.stats[rownames(df.stats) == "p.Region 1.Deer",c("2.5%","97.5%")] ``` Note that you can also do the same from the MCMC draws directly. `p.fac1` is the diet proportion by Region (factor 1), indexed as `[MCMC chain, Region, Source]`. See that these match the stats above. ```{r} source$source_names # confirm that Deer = source 1 quantile(jags.1$BUGSoutput$sims.list$p.fac1[,1,1], probs=c(.025,.975)) ``` Calculate the probability that the Region 2 Deer diet proportion is greater than 0.7. ```{r} # Total num draws tot <- length(jags.1$BUGSoutput$sims.list$p.fac1[,2,1]) # Num draws above 0.7 above <- length(which(jags.1$BUGSoutput$sims.list$p.fac1[,2,1] > 0.7)) # Prob that the diet proportion is above 70% (prob <- above/tot) ``` Or maybe we want the probability that Pack 4 eats more Deer than Pack 7: ```{r} df.stats[rownames(df.stats) %in% c("p.Pack 4.Deer","p.Pack 7.Deer"),] ``` ```{r} (prob.Deer.Pack4.Pack7 <- sum(jags.1$BUGSoutput$sims.list$p.fac2[,4,1] > jags.1$BUGSoutput$sims.list$p.fac2[,7,1])/tot) ``` We can also get a complete posterior probability for the *difference* between Pack 4 and Pack 7 (i.e. is Pack4 - Pack7 greater than 0?) ```{r} p.Deer.Pack4.Pack7 <- jags.1$BUGSoutput$sims.list$p.fac2[,4,1] - jags.1$BUGSoutput$sims.list$p.fac2[,7,1] hist(p.Deer.Pack4.Pack7,breaks=50,col="grey", main="Difference between Deer proportions, Pack 4 - Pack 7") abline(v=0,col="red",lty=2,lwd=3) ``` ## Posterior density plots We can access the posterior density plots for later modification since we set `output_options$return_obj = TRUE`. ```{r} g.post <- output_posteriors(jags.1, mix, source, output_options) ``` `g.post` is a named nested list of `ggplot` objects - `global`: overall/global proportions - `fac1`: factor 1 (if in model) - `fac2`: factor 2 (if in model) - `both`: if model with 2 fixed effects or 1 fixed effect + 1 random effect (cannot plot one without the other) - `cont`: continuous effect (if in model) - `sig`: random effect variance terms (if model has random effects) - `epsilon`: multiplicative error term (if model uses "Residual * Process" error) ```{r} names(g.post) ``` The default `p.global` posterior density plot looks ok. ```{r} g.post$global ``` The Region plots are in the nested list `g.post$fac1` (likewise for Pack plots, in `g.post$fac2`). Plot Region 1: ```{r} g.post$fac1[[1]] ``` Plot Pack 5 with a different color palette. ```{r} # note the 'ggplot2::' is only necessary for building this vignette # in your code you can simply load ggplot2 with library(ggplot2) g.post$fac2[[5]] + ggplot2::scale_fill_manual(values=c("#999999", "#E69F00", "#56B4E9")) + ggplot2::scale_color_manual(values=c("#999999", "#E69F00", "#56B4E9")) ``` ## Continuous effect The [Alligator Example](https://github.com/brianstock/MixSIAR/blob/master/inst/example_scripts/mixsiar_script_alligator.R) has a continuous effect. It takes awhile to run and we only use model 5 (Length) below, so instead of running the entire example you can run only model 5. If you want to run all 8 models (full example), it's a good idea to save the results for later. ```{r eval=F} rm(list=ls()) # clear wolves ex objects source(file.path(mixsiar.dir,"example_scripts","mixsiar_script_alligator.R")) # run all 8 models save.image("where/to/save/output/alligator_short.RData") # specify path, where to save file ``` First change `output_options` to return the `ggplot` objects, then get the posterior density plots ```{r eval=F} output_options$return_obj = TRUE g.post <- output_posteriors(jags.mod[[5]], mix[[5]], source[[5]], output_options) ``` In this case there is no `g.post$fac1`, `g.post$fac2`, or `g.post$sig` because there are no fixed/random effects. There is a `g.post$cont`, which holds the plots for the continuous effect, Length. ```{r eval=F} names(g.post) ``` `g.post$cont` has 4 plots, each of which can be modified: - proportion vs. continuous variable (Length) - proportions of the min(Length) alligator - proportions of the median(Length) alligator - proportions of the max(Length) alligator ```{r eval=F} g.post$cont[[1]] + ggplot2::theme(legend.position="right") ``` ```{r eval=F} g.post$cont[[2]] + ggplot2::scale_fill_grey() ``` ```{r eval=F} g.post$cont[[3]] ``` ```{r eval=F} g.post$cont[[4]] ``` The `plot_continuous_var` function has a couple other options you can modify: - `alphaCI`: alpha level to use for credible interval width (default = 0.05, 95% CI) - `exclude_sources_below`: proportion threshhold to include sources in the plot. Some sources can be estimated at very low proportions, and this makes the plot less clear. Setting `exclude_sources_below = 0.1` will remove sources with *median* proportion less than 10% for *all values of the continuous variable*. ```{r eval=F} g.cont <- plot_continuous_var(jags.1, mix, source, output_options, alphaCI=0.05, exclude_sources_below=0.1) g.cont ``` Compare to plot using 80% CI: ```{r eval=F} g.cont80 <- plot_continuous_var(jags.1, mix, source, output_options, alphaCI=0.2) g.cont80 ```