---
title: "FAQ"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{FAQ}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# General questions

**GQ1: Tell me in 10 lines how to use this package.**

**GA1:** Get the dependency graph of several R packages on CRAN or Github at a specific snapshot date(time)

```r
library(rang)
graph <- resolve(c("crsh/papaja", "rio"), snapshot_date = "2019-07-21")
```

Dockerize the dependency graph to a directory

```r
dockerize(graph, output_dir = "rangtest")
```

You can build the Docker image with either the R package `stevedore` or the Docker CLI client. We use the CLI client here.

```sh
docker build -t rangimg ./rangtest ## might need sudo
```

Launch the container with the built image

```sh
docker run --rm --name "rangcontainer" -ti rangimg
```

And the tenth line is not needed.

**GQ2: For running `resolve()`, how do I know which packages are used in a project?**

**GA2:** `rang` >= 0.2 supports scanning of a directory for R packages (the current working directory by default). `snapshot_date` is inferred from the latest modification date of all files.

```r
resolve()
```

A better strategy, however, is to do the scanning first and then manually review which packages are from non-CRAN sources.

```r
pkgs <- as_pkgrefs(".")
```

**GQ3: Why is the R script generated by `dockerize()` and `export_rang()` so strange/unidiomatic/inefficient/did you guys read `fortunes::fortune("answer is parse")`?**

**GA3:** It is because we optimize the R code in `rang.R` for backward compatibility. We need to make sure that the code runs well in vanilla R environments as far back as R 1.3.1.

**GQ4: Why doesn't `rang` support reconstructing computational environments with R < ~~2.1.0~~ 1.3.1 yet?**

**GA4:** ~~It is because installing source packages from within R was introduced in R 2.1.0. Before that one needed to install source packages with `R CMD INSTALL`. But we are working on supporting R in the 1.x series.~~ Support for the R 1.x series is available in rang >= 0.2. But R versions older than 1.3.1 are still not supported because we haven't found an effective way to automatically compile R < 1.3.1.

**GQ5: Does `rang.R` (generated by `export_rang()` or `dockerize()`) run on non-Linux OSes?**

**GA5:** Theoretically speaking, yes, but it is strongly discouraged. If the system requirements are fulfilled, `rang.R` should probably run fine on OS X if the R packages do not contain compiled code. If they do, C and Fortran compilers are needed. See [this entry](https://cran.r-project.org/bin/macosx/RMacOSX-FAQ.html#Installation-of-source-packages) in the R Mac OS X FAQ. On Windows, installing Github packages requires a properly set up PATH and `tar`. Similarly, R packages with compiled code require C / Fortran compilers. See [this entry](https://cran.r-project.org/bin/windows/base/rw-FAQ.html#Can-I-install-packages-into-libraries-in-this-version_003f) in the R for Windows FAQ.

**GQ6: What are the caveats of using rang?**

**GA6:** Many

* `rang` does not support reconstructing computational environments with R < 1.3.1 (i.e. `snapshot_date` < "2001-08-31 14:58") yet
* `dockerize()` can only generate Debian/Ubuntu-based Docker images; this also means that packages depending on features specific to non-Linux platforms (e.g. WinBUGS) do not work.
* `dockerize(cache = TRUE)` does not cache ~~[R source code](https://cran.r-project.org/src/base/) (yet) and~~ (available in rang >= 0.2.1) System Requirements (in `deb` packages)
* `query_sysreqs()` (as well as `resolve(query_sysreqs = TRUE)`) queries for System Requirements based on the latest version of the packages on CRAN / Github. Therefore:
    * Removed CRAN packages are assumed to have no System Requirements
    * R packages with changed System Requirements between `snapshot_date` and the date of running `resolve()` might produce incorrect System Requirements
* A result from `resolve()` in the following cases must be dockerized with caching (i.e. `dockerize(cache = TRUE)`)
    * R version < 3.1 and has at least one Github package. It is because the outdated version of Debian cannot communicate with the Github API
    * R version < 3.3 and has at least one Bioconductor package, same reason.
    * Has at least one local package.
    * R version < 2.1
* R packages on Github, CRAN, and Bioconductor might not be available in the near future (Github: likely; CRAN and Bioconductor: very unlikely). But one can cache the packages (`dockerize(cache = TRUE)`).
* The Rocker project and its host Docker Hub might not be available in the near future (unlikely)
* Ubuntu / Debian archives (for System Requirements) might not be available in the future (super unlikely)

**GQ7: `rang` depends on R >= 3.5.0. Several of its dependencies depend on many modern R packages. How dare you claim that your package supports R >= 1.3.1?**

**GA7:** To clarify, it is true that `resolve()` and `dockerize()` depend on many factors, including a modern version of R. But the reconstruction process (if with caching of R packages) depends only on the availability of Docker images from Docker Hub, availability of R source code on CRAN (R < 3.1.0), and `deb` packages from Ubuntu and Debian in the future. If you don't believe in all of these, see also: DQ4.

**GQ8: What are the data sources of `resolve()`?**

**GA8:** Several

* Dependencies / R version / System Requirements of CRAN packages: r-hub APIs ([pkgsearch](https://r-hub.github.io/pkgsearch/), [r-versions](https://api.r-hub.io/rversions), [sysreqs](https://sysreqs.r-hub.io/))
* Github: [Github API](https://docs.github.com/en/rest)
* Dependencies of Bioconductor packages: [Bioconductor](https://bioconductor.org/)

**GQ9: I am not convinced by this package. What are the alternatives?**

**GA9:** If you don't consider the Dockerization part of `rang`, the date-based pinning of R packages can be done by:

* Using [Posit Public Package Manager](https://packagemanager.rstudio.com/)

```r
library(pak)
options(repos = c(REPO_NAME = "https://packagemanager.rstudio.com/cran/2019-07-21"))
pkg_install("rio")
pkg_install("crsh/papaja")
```

* Using [groundhog](https://groundhogr.com/)

```r
library(groundhog)
pkgs <- c("rio", "crsh/papaja")
groundhog.library(pkgs, "2019-07-21")
```

If you don't consider the date-based pinning of R packages, the Dockerization can be done by:

* Using [containerit](https://github.com/o2r-project/containerit) [not on CRAN]

```r
library(containerit)
## combine with Package Manager to pin packages by date
install.packages("rio")
remotes::install_github("crsh/papaja")
library(rio)
library(papaja)
print(containerit::dockerfile(from = utils::sessionInfo()))
```

* Using [dockerfiler](https://CRAN.R-project.org/package=dockerfiler)

```r
library(dockerfiler)
my_dock <- Dockerfile$new()
## combine with Package Manager to pin packages by date
my_dock$RUN(r(install.packages(c("remotes", "rio"))))
my_dock$RUN(r(remotes::install_github("crsh/papaja")))
my_dock
```

**GQ10: I want to know more about this package.**

**GA10:** Good. Read our [preprint](https://arxiv.org/abs/2303.04758).

# Docker questions

**DQ1: Is Docker an overkill to simply ensure that a few lines of R code are reproducible?**

**DA1:** It might be the case for recent R code, e.g. R >= 3.0 (or `snapshot_date` > "2013-04-03 09:10"). But we position `rang` as an archaeological tool to run really old R code (`snapshot_date` >= "2005-04-19 09:01", but see GQ4). For this, Docker is essential because R in the 2.x/1.x series might not be installable anymore in a non-virtualized environment.

According to [The Turing Way](https://the-turing-way.netlify.app/reproducible-research/compendia.html), a research compendium that aids computational reproducibility should contain a complete description of the computational environment. The directory exported by `dockerize()`, especially when `materials_dir` and `cache` were used, can be directly shared as a research compendium.

**DQ2: How do I access bash instead of R?**

**DA2:** By default, containers launched with the images generated by `rang` start R. One can override this by launching the container with an alternative entry point.

Suppose an image was built as per GA1.

```sh
docker run --rm --name "rangcontainer" --entrypoint bash -ti rangimg
```
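The same override also works non-interactively. As a rough sketch (the `run_in_image` helper name is ours and is not part of `rang` or Docker), everything that follows the image name on the command line is handed to the entry point, so `bash -c` can run a single command and exit:

```sh
# Hypothetical helper: run one shell command in a throwaway container.
# $1 = image name (e.g. rangimg from GA1), $2 = command string for bash -c
run_in_image() {
  docker run --rm --entrypoint bash "$1" -c "$2"
}

# Example (requires the image built in GA1):
# run_in_image rangimg 'ls /'
```

Because of `--rm`, the container is removed as soon as the command finishes.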

**DQ3: How do I copy files from and to a launched container?**

**DA3:** Again, suppose an image was built as per GA1 and launched as below.

```sh
docker run --rm --name "rangcontainer" -ti rangimg
```

```sh
# probably you need to run this from another terminal
docker cp rangcontainer:/rang.R rang2.R
docker cp rang2.R rangcontainer:/rang2.R
```

We want to emphasize here that launching a container with `--name` is useful, because the name of the container is randomly generated when `--name` is not used. Also remember that a relaunched container goes back to its initial state: any file previously generated inside the container is removed. So use `docker cp` to copy out any artifact you want to preserve.
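When there are several artifacts to preserve, the `docker cp` calls can be wrapped in a small loop. The following is only a sketch: `copy_artifacts` is a hypothetical helper, and `/results.csv` is a made-up file name used for illustration.

```sh
# Hypothetical helper: copy several files out of a running container.
# $1 = container name, $2 = local destination directory,
# remaining arguments = paths inside the container
copy_artifacts() {
  container="$1"
  dest="$2"
  shift 2
  mkdir -p "$dest" || return 1
  failed=0
  for path in "$@"; do
    # Keep going after a failed copy, but remember that it happened
    if ! docker cp "$container:$path" "$dest/"; then
      echo "could not copy $path" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Example, from another terminal while the container from GA1 is running:
# copy_artifacts rangcontainer artifacts /rang.R /results.csv
```

The function returns a non-zero status if any copy failed, so it can be chained with `&&` before stopping the container.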

**DQ4: How do I back up an image?**

**DA4:** If you don't believe Docker Hub / Debian archives / Ubuntu archives would be available forever, you may back up the generated image.

```sh
docker save rangimg | gzip > rangimg.tar.gz
```

You can also share the gzipped backup tarball (usually < 1 GB, depending on the size of `materials_dir`, and thus shareable on Zenodo).
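If the tarball is going to be shared, it may help to ship a checksum with it so that recipients can confirm the download is intact. A minimal sketch, assuming GNU coreutils (`sha256sum`) is available; the helper names `backup_image` and `verify_backup` are hypothetical:

```sh
# Hypothetical helpers; on macOS, `shasum -a 256` is the usual substitute
# for `sha256sum`.

# Save an image as a gzipped tarball and record its SHA-256 checksum.
# $1 = image name, $2 = output file in the current directory
# Note: the pipeline's exit status is gzip's, so a failed `docker save`
# is not detected here.
backup_image() {
  docker save "$1" | gzip > "$2" && sha256sum "$2" > "$2.sha256"
}

# Check that a tarball is intact before loading it.
# $1 = tarball, with "$1.sha256" next to it in the current directory
verify_backup() {
  gzip -t "$1" && sha256sum -c "$1.sha256"
}

# Example:
# backup_image rangimg rangimg.tar.gz
# verify_backup rangimg.tar.gz && docker load < rangimg.tar.gz
```

Both files (`rangimg.tar.gz` and `rangimg.tar.gz.sha256`) would then be uploaded together.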

To restore the backup image:

```sh
docker load < rangimg.tar.gz
```

And launch a container the same way

```sh
docker run --rm --name "rangcontainer" -ti rangimg
```

# Apptainer/Singularity questions

**AQ1: I am on HPC and I don't have Docker there. Can I use Apptainer/Singularity instead of Docker?**

**AA1:** Docker may require root privileges and is not usually available on HPC systems. You might have Singularity or Apptainer instead. Apptainer/Singularity does not require root to run images (only to build them). You can build images on your own Linux PC (or in VirtualBox on Windows or macOS), on a virtual private server, or for [free in the cloud](https://cloud.sylabs.io/builder).

You have two options:

1. You can prepare (using `dockerize()`) and build a Docker image and convert it to an Apptainer/Singularity image. See [Apptainer/Singularity documentation](https://apptainer.org/docs/user/latest/docker_and_oci.html) for that.

2. You can use the `apptainerize()` function just like you would use `dockerize()`.

```r
apptainerize(graph, output_dir = "rangtest")
```

Afterwards you build an image:

```sh
cd rangtest
apptainer build container.sif container.def
# sudo singularity build container.sif container.def # same as above
```

And run the container:

```sh
apptainer exec container.sif R
# singularity exec container.sif R # same as above
```

To stop the container when you are done with it, just quit R.


`apptainer` and `singularity` shell commands are interchangeable, at least for now. See [Apptainer Singularity compatibility](https://apptainer.org/docs/user/latest/singularity_compatibility.html) for details.

The `apptainerize()`/`singularize()` functions work exactly the same as `dockerize()`, except that you cannot cache the Linux distribution rootfs.


**AQ2: What if I want to run RStudio IDE in a container instead of just CLI R?**

**AA2:** To run RStudio IDE in an Apptainer/Singularity container, some writeable folders and a config file have to be created locally:

```bash
mkdir -p run var-lib-rstudio-server .rstudio
printf 'provider=sqlite\ndirectory=/var/lib/rstudio-server\n' > database.conf
```

After that, you can run the container (do not run it as the `root` user, otherwise you will not be able to log in to RStudio IDE).

Start instance (on the default RStudio port 8787):

```bash
apptainer instance start \
    --bind run:/run,var-lib-rstudio-server:/var/lib/rstudio-server,database.conf:/etc/rstudio/database.conf,.rstudio:/home/rstudio/.rstudio/ \
    container.sif \
    rangtest
```

Now open a browser and go to localhost:8787.
The default username is your local username, and the default password is 'set_your_password' (if you are using a container generated by `rang`).


List running instances:

```bash
apptainer instance list
```

Stop instance:

```bash
apptainer instance stop rangtest
```

Start instance with custom port (e.g. 8080) and password:

```bash
apptainer instance start \
    --env RPORT=8080 \
    --env PASSWORD='set_your_password' \
    --bind run:/run,var-lib-rstudio-server:/var/lib/rstudio-server,database.conf:/etc/rstudio/database.conf,.rstudio:/home/rstudio/.rstudio/ \
    container.sif \
    rangtest
```

Run container with custom `rserver` command line:

```bash
apptainer exec \
    --env PASSWORD='set_your_password' \
    --bind run:/run,var-lib-rstudio-server:/var/lib/rstudio-server,database.conf:/etc/rstudio/database.conf,.rstudio:/home/rstudio/.rstudio/ \
    container.sif \
    /usr/lib/rstudio-server/bin/rserver \
    --auth-none=0 --auth-pam-helper-path=pam-helper \
    --server-user=$(whoami) --www-port=8787
```

If you run the container using the `apptainer exec` command, you will have to stop the server yourself, either by killing the `rserver` process manually or by pressing Cmd/Ctrl+C in the terminal where the container is running.