---
title: "Modifying MixSIAR plots"
output: html_vignette
vignette: >
  %\VignetteIndexEntry{Modifying MixSIAR plots}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc} 
---

```{r include=F, echo=F}
knitr::opts_chunk$set(fig.width=7, fig.height=5) 
```

MixSIAR generates lots of output from a fit model. Up to version 3.1.12, by default these were printed to the R console with options to suppress this behavior and save png/pdf/txt files instead. Modifying these plots is necessary to make publication/presentation-quality figures. In version 3.1.13, we added the capability to return the `ggplot2` objects in order to make this process easier.

As suggested in [#235](https://github.com/brianstock/MixSIAR/issues/235), `output_JAGS` has been split into separate functions. To maintain backward compatibility, the original `output_JAGS` function remains untouched and will continue to behave as before. However, a new argument to the `output_options` list is used (`output_options$return_obj = TRUE`) to return objects from the new functions:

+ `output_diagnostics`: named list of three data frames (one each for Gelman, Heidelberg-Welch, and Geweke)
+ `output_stats`: data frame of summary statistics
+ `output_posteriors`: named nested list of `ggplot` objects

  - `global`: overall/global proportions
  - `fac1`: factor 1 (if in model)
  - `fac2`: factor 2 (if in model)
  - `both`: if model with 2 fixed effects or 1 fixed effect + 1 random effect (cannot plot one without the other)
  - `cont`: continuous effect (if in model)
  - `sig`: random effect variance terms (if model has random effects)
  - `epsilon`: multiplicative error term (if model uses "Residual * Process" error)

Below, we demonstrate how to modify output from the [Wolves](https://brianstock.github.io/MixSIAR/articles/wolves_ex.html) and [Alligator](https://github.com/brianstock/MixSIAR/blob/master/inst/example_scripts/mixsiar_script_alligator.R) examples. If you have not already done so, see these vignettes first for more commentary and explanation.

## Install MixSIAR from Github

The latest changes are not yet on CRAN. Install the GitHub version:
```{r, eval=FALSE}
install.packages("devtools")
remotes::install_github("brianstock/MixSIAR", dependencies=T)
```

## Run wolves example

Run the wolves example with:
```{r, eval=FALSE}
library(MixSIAR)
mixsiar.dir <- find.package("MixSIAR")
source(file.path(mixsiar.dir,"example_scripts","mixsiar_script_wolves_normal.R"))
```

```{r setup, include=FALSE, echo=FALSE}
library(MixSIAR)
load(url("https://github.com/brianstock/MixSIAR/blob/master/Manual/wolves_normal.RData?raw=true"))
```

## Set output options to return `ggplot` objects

```{r}
output_options <- list(summary_save = TRUE,                 
                       summary_name = "summary_statistics", 
                       sup_post = TRUE,                    
                       plot_post_save_pdf = FALSE,           
                       plot_post_name = "posterior_density",
                       sup_pairs = TRUE,             
                       plot_pairs_save_pdf = TRUE,    
                       plot_pairs_name = "pairs_plot",
                       sup_xy = TRUE,           
                       plot_xy_save_pdf = TRUE,
                       plot_xy_name = "xy_plot",
                       gelman = TRUE,
                       heidel = FALSE,  
                       geweke = TRUE,   
                       diag_save = TRUE,
                       diag_name = "diagnostics",
                       indiv_effect = FALSE,       
                       plot_post_save_png = FALSE, 
                       plot_pairs_save_png = FALSE,
                       plot_xy_save_png = FALSE,
                       diag_save_ggmcmc = FALSE,
                       return_obj = TRUE)

```

## Diagnostics

The diagnostics output can be saved as a list of data frames:

```{r}
diag <- output_diagnostics(jags.1, mix, source, output_options)
```

There is one data frame for each of: Gelman-Rubin, Geweke, and Heidelberger-Welch.

```{r}
names(diag)
```

```{r}
head(diag$gelman)
```

```{r}
head(diag$geweke)
```

## Summary statistics

The summary statistics can be saved as a data frame.

```{r}
df.stats <- output_stats(jags.1, mix, source, output_options)
```

You can access individual stats using the rownames.

```{r}
rownames(df.stats)
```

For example, look at the Salmon diet proportion for Pack 4 wolves:
```{r}
df.stats[rownames(df.stats) == "p.Pack 4.Salmon",]
```

For example, get the 95% CI for the Deer diet proportion for Region 1 wolves:
```{r}
df.stats[rownames(df.stats) == "p.Region 1.Deer",c("2.5%","97.5%")]
```

Note that you can also do the same from the MCMC draws directly. `p.fac1` is the diet proportion by Region (factor 1), indexed as `[MCMC chain, Region, Source]`. See that these match the stats above.
```{r}
source$source_names # confirm that Deer = source 1
quantile(jags.1$BUGSoutput$sims.list$p.fac1[,1,1], probs=c(.025,.975))
```

Calculate the probability that the Region 2 Deer diet proportion is greater than 0.7.

```{r}
# Total num draws
tot <- length(jags.1$BUGSoutput$sims.list$p.fac1[,2,1])
# Num draws above 0.7
above <- length(which(jags.1$BUGSoutput$sims.list$p.fac1[,2,1] > 0.7))
# Prob that the diet proportion is above 70%
(prob <- above/tot)
```

Or maybe we want the probability that Pack 4 eats more Deer than Pack 7:

```{r}
df.stats[rownames(df.stats) %in% c("p.Pack 4.Deer","p.Pack 7.Deer"),]
```

```{r}
(prob.Deer.Pack4.Pack7 <- sum(jags.1$BUGSoutput$sims.list$p.fac2[,4,1] > jags.1$BUGSoutput$sims.list$p.fac2[,7,1])/tot)
```

We can also get a complete posterior probability for the *difference* between Pack 4 and Pack 7 (i.e. is Pack4 - Pack7 greater than 0?)

```{r}
p.Deer.Pack4.Pack7 <- jags.1$BUGSoutput$sims.list$p.fac2[,4,1] - jags.1$BUGSoutput$sims.list$p.fac2[,7,1]
hist(p.Deer.Pack4.Pack7,breaks=50,col="grey", main="Difference between Deer proportions, Pack 4 - Pack 7")
abline(v=0,col="red",lty=2,lwd=3)
```

## Posterior density plots

We can access the posterior density plots for later modification since we set `output_options$return_obj = TRUE`.

```{r}
g.post <- output_posteriors(jags.1, mix, source, output_options)
```

`g.post` is a named nested list of `ggplot` objects

  - `global`: overall/global proportions
  - `fac1`: factor 1 (if in model)
  - `fac2`: factor 2 (if in model)
  - `both`: if model with 2 fixed effects or 1 fixed effect + 1 random effect (cannot plot one without the other)
  - `cont`: continuous effect (if in model)
  - `sig`: random effect variance terms (if model has random effects)
  - `epsilon`: multiplicative error term (if model uses "Residual * Process" error)

```{r}
names(g.post)
```

The default `p.global` posterior density plot looks ok.

```{r}
g.post$global
```

The Region plots are in the nested list `g.post$fac1` (likewise for Pack plots, in `g.post$fac2`). Plot Region 1:

```{r}
g.post$fac1[[1]]
```

Plot Pack 5 with a different color palette.
```{r}
# note the 'ggplot2::' is only necessary for building this vignette
# in your code you can simply load ggplot2 with library(ggplot2)
g.post$fac2[[5]] + 
  ggplot2::scale_fill_manual(values=c("#999999", "#E69F00", "#56B4E9")) + 
  ggplot2::scale_color_manual(values=c("#999999", "#E69F00", "#56B4E9"))
```

## Continuous effect

The [Alligator Example](https://github.com/brianstock/MixSIAR/blob/master/inst/example_scripts/mixsiar_script_alligator.R) has a continuous effect. It takes awhile to run and we only use model 5 (Length) below, so instead of running the entire example you can run only model 5. If you want to run all 8 models (full example), it's a good idea to save the results for later.

```{r eval=F}
rm(list=ls()) # clear wolves ex objects
source(file.path(mixsiar.dir,"example_scripts","mixsiar_script_alligator.R")) # run all 8 models
save.image("where/to/save/output/alligator_short.RData") # specify path, where to save file
```

First change `output_options` to return the `ggplot` objects, then get the posterior density plots
```{r eval=F}
output_options$return_obj = TRUE
g.post <- output_posteriors(jags.mod[[5]], mix[[5]], source[[5]], output_options)
```

In this case there is no `g.post$fac1`, `g.post$fac2`, or `g.post$sig` because there are no fixed/random effects. There is a `g.post$cont`, which holds the plots for the continuous effect, Length.
```{r eval=F}
names(g.post)
```

`g.post$cont` has 4 plots, each of which can be modified:

- proportion vs. continuous variable (Length)
- proportions of the min(Length) alligator
- proportions of the median(Length) alligator
- proportions of the max(Length) alligator

```{r eval=F}
g.post$cont[[1]] +
  ggplot2::theme(legend.position="right")
```

```{r eval=F}
g.post$cont[[2]] +
  ggplot2::scale_fill_grey()
```

```{r eval=F}
g.post$cont[[3]]
```

```{r eval=F}
g.post$cont[[4]]
```

The `plot_continuous_var` function has a couple other options you can modify:

- `alphaCI`: alpha level to use for credible interval width (default = 0.05, 95% CI)
- `exclude_sources_below`: proportion threshhold to include sources in the plot. Some sources can be estimated at very low proportions, and this makes the plot less clear. Setting `exclude_sources_below = 0.1` will remove sources with *median* proportion less than 10% for *all values of the continuous variable*.

```{r eval=F}
g.cont <- plot_continuous_var(jags.1, mix, source, output_options, alphaCI=0.05, exclude_sources_below=0.1)
g.cont
```

Compare to plot using 80% CI:

```{r eval=F}
g.cont80 <- plot_continuous_var(jags.1, mix, source, output_options, alphaCI=0.2)
g.cont80
```