---
title: "FAQ"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{FAQ}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
# General questions
**GQ1: Tell me in 10 lines how to use this package.**
**GA1:** Get the dependency graph of several R packages on CRAN or Github at a specific snapshot date(time)
```r
graph <- resolve(c("crsh/papaja", "rio"), snapshot_date = "2019-07-21")
```
Dockerize the dependency graph to a directory
```r
dockerize(graph, output_dir = "rangtest")
```
You can build the Docker image with either the R package `stevedore` or the Docker CLI client. We use the CLI client here.
```sh
docker build -t rangimg ./rangtest ## might need sudo
```
Launch the container with the built image
```sh
docker run --rm --name "rangcontainer" -ti rangimg
```
And the tenth line is not needed.
**GQ2: For running `resolve()`, how do I know which packages are used in a project?**
**GA2:** `rang` >= 0.2 supports scanning of a directory for R packages (the current working directory by default). `snapshot_date` is inferred from the latest modification date of all files.
```r
resolve()
```
A better strategy, however, is to do the scanning first and then manually review which packages are from non-CRAN sources.
```r
pkgs <- as_pkgrefs(".")
```
**GQ3: Why is the R script generated by `dockerize()` and `export_rang()` so strange/unidiomatic/inefficient/did you guys read `fortunes::fortune("answer is parse")`?**
**GA3:** It is because we optimize the R code in `rang.R` for backward compatibility. We need to make sure that the code runs well in vanilla R environments since 1.3.1.
**GQ4: Why doesn't `rang` support reconstructing computational environments with R < ~~2.1.0~~ 1.3.1 yet?**
**GA4:** ~~It is because installing source packages from within R was introduced in R 2.1.0. Before that one needed to install source packages with `R CMD INSTALL`. But we are working on supporting R in the 1.x series.~~ Support for the R 1.x series is available in rang >= 0.2. But R versions older than 1.3.1 are still not supported because we haven't found an effective way to automatically compile R < 1.3.1.
**GQ5: Does `rang.R` (generated by `export_rang()` or `dockerize()`) run on non-Linux OSes?**
**GA5:** Theoretically speaking, yes, but it is strongly discouraged. If the system requirements are fulfilled, `rang.R` should probably run fine on OS X as long as the R packages do not contain compiled code; if they do, C and Fortran compilers are needed. See [this entry](https://cran.r-project.org/bin/macosx/RMacOSX-FAQ.html#Installation-of-source-packages) in the R Mac OS X FAQ. On Windows, installing Github packages requires a properly set up PATH and `tar`. Similarly, R packages with compiled code require C / Fortran compilers. See [this entry](https://cran.r-project.org/bin/windows/base/rw-FAQ.html#Can-I-install-packages-into-libraries-in-this-version_003f) in the R for Windows FAQ.
**GQ6: What are the caveats of using rang?**
**GA6:** Many
* `rang` does not support reconstructing computational environments with R < 1.3.1 (i.e. `snapshot_date` < "2001-08-31 14:58") yet
* `dockerize()` can only generate Debian/Ubuntu-based Docker images; this also means that packages depending on features specific to other operating systems (e.g. WinBUGS) do not work.
* `dockerize(cache = TRUE)` does not cache System Requirements (in `deb` packages); caching of [R source code](https://cran.r-project.org/src/base/) is available in rang >= 0.2.1
* `query_sysreqs()` (as well as `resolve(query_sysreqs = TRUE)`) queries for System Requirements based on the latest version of the packages on CRAN / Github. Therefore:
    * Removed CRAN packages are assumed to have no System Requirements
    * R Packages with changed System Requirements between `snapshot_date` and the date of running `resolve()` might produce incorrect System Requirements
* A result from `resolve()` in the following cases must be dockerized with caching (i.e. `dockerize(cache = TRUE)`)
    * R version < 3.1 and has at least one Github package. It is because the outdated version of Debian cannot communicate with the Github API
    * R version < 3.3 and has at least one Bioconductor package, same reason.
    * Has at least one local package.
    * R version < 2.1
* R packages on Github, CRAN, and Bioconductor might not be available in the near future (Github: likely; CRAN and Bioconductor: very unlikely). But one can cache the packages (`dockerize(cache = TRUE)`).
* The Rocker project and its host Docker Hub might not be available in the near future (unlikely)
* Ubuntu / Debian archives (for System Requirements) might not be available in the future (super unlikely)
**GQ7: `rang` depends on R >= 3.5.0. Several of the dependencies depend on many modern R packages. How dare you claim that your package supports R >= 1.3.1?**
**GA7:** To clarify, it is true that `resolve()` and `dockerize()` depend on many factors, including a modern version of R. But the reconstruction process (when R packages are cached) depends only on the availability of Docker images from Docker Hub, the availability of R source code on CRAN (R < 3.1.0), and `deb` packages from Ubuntu and Debian in the future. If you don't believe in all of these, see also: DQ4.
**GQ8: What are the data sources of `resolve()`?**
**GA8:** Several
* Dependencies / R version / System Requirements of CRAN packages: r-hub APIs [pkgsearch](https://r-hub.github.io/pkgsearch/) [r-versions](https://api.r-hub.io/rversions) [sysreqs](https://sysreqs.r-hub.io/)
* Github: [Github API](https://docs.github.com/en/rest)
* Dependencies of Bioconductor packages: [Bioconductor](https://bioconductor.org/)
**GQ9: I am not convinced by this package. What are the alternatives?**
**GA9:** If you don't consider the Dockerization part of `rang`, the date-based pinning of R packages can be done by:
* Using [Posit Public Package Manager](https://packagemanager.rstudio.com/)
```r
library(pak)
options(repos = c(REPO_NAME = "https://packagemanager.rstudio.com/cran/2019-07-21"))
pkg_install("rio")
pkg_install("crsh/papaja")
```
* Using [groundhog](https://groundhogr.com/)
```r
library(groundhog)
pkgs <- c("rio","crsh/papaja")
groundhog.library(pkgs, "2019-07-21")
```
If you don't consider the date-based pinning of R packages, the Dockerization can be done by:
* Using [containerit](https://github.com/o2r-project/containerit) [not on CRAN]
```r
library(containerit)
## combine with Package Manager to pin packages by date
install.packages("rio")
remotes::install_github("crsh/papaja")
library(rio)
library(papaja)
print(containerit::dockerfile(from = utils::sessionInfo()))
```
* Using [dockerfiler](https://CRAN.R-project.org/package=dockerfiler)
```r
library(dockerfiler)
my_dock <- Dockerfile$new()
## combine with Package Manager to pin packages by date
my_dock$RUN(r(install.packages(c("remotes", "rio"))))
my_dock$RUN(r(remotes::install_github("crsh/papaja")))
my_dock
```
**GQ10: I want to know more about this package.**
**GA10:** Good. Read our [preprint](https://arxiv.org/abs/2303.04758).
# Docker questions
**DQ1: Is Docker an overkill to simply ensure that a few lines of R code are reproducible?**
**DA1:** It might be the case for recent R code, e.g. R >= 3.0 (or `snapshot_date` > "2013-04-03 09:10"). But we position `rang` as an archaeological tool to run really old R code (`snapshot_date` >= "2005-04-19 09:01", but see GQ4). For this, Docker is essential because R in the 2.x/1.x series might not be installable anymore in a non-virtualized environment.
According to [The Turing Way](https://the-turing-way.netlify.app/reproducible-research/compendia.html), a research compendium that aids computational reproducibility should contain a complete description of the computational environment. The directory exported by `dockerize()`, especially when `materials_dir` and `cache` were used, can be directly shared as a research compendium.
**DQ2: How do I access bash instead of R?**
**DA2:** By default, containers launched with the images generated by `rang` start R. One can override this by launching the container with an alternative entry point.
Suppose an image was built as per GA1.
```sh
docker run --rm --name "rangcontainer" --entrypoint bash -ti rangimg
```
**DQ3: How do I copy files from and to a launched container?**
**DA3:** Again, suppose an image was built as per GA1 and a container was launched as below
```sh
docker run --rm --name "rangcontainer" -ti rangimg
```
```sh
# probably you need to run this from another terminal
docker cp rangcontainer:/rang.R rang2.R
docker cp rang2.R rangcontainer:/rang2.R
```
We want to emphasize that launching a container with `--name` is useful because otherwise the container gets a randomly generated name. It is also important to remember that a relaunched container goes back to its initial state: any file previously generated inside the container is removed. So use `docker cp` to copy out any artifact you want to preserve.
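When several files need to be preserved, the `docker cp` calls can be wrapped in a small shell function. The following is only a sketch: `rescue_artifacts` is a hypothetical helper (not part of `rang` or Docker), and the paths in the usage line are placeholders.

```sh
# Hypothetical helper, not part of rang or Docker.
# Usage: rescue_artifacts CONTAINER DEST_DIR PATH_IN_CONTAINER...
rescue_artifacts() {
  if [ "$#" -lt 3 ]; then
    echo "usage: rescue_artifacts CONTAINER DEST_DIR PATH..." >&2
    return 2
  fi
  container=$1
  dest=$2
  shift 2
  # Create the local destination directory if it does not exist yet.
  mkdir -p "$dest" || return 1
  # Copy each requested path out of the container, stopping at the first error.
  for path in "$@"; do
    docker cp "$container:$path" "$dest/" || return 1
  done
}

# Example (from another terminal, while the container is still running):
#   rescue_artifacts rangcontainer ./artifacts /rang.R
```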
**DQ4: How do I back up an image?**
**DA4:** If you don't believe Docker Hub / Debian archives / Ubuntu archives would be available forever, you may back up the generated image.
```sh
docker save rangimg | gzip > rangimg.tar.gz
```
You can also share the gzipped backup tarball (usually < 1G, depending on the size of `materials_dir`, thus shareable on Zenodo).
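If the tarball is going to be archived or shared, it may be worth recording a checksum next to it, so that whoever restores it later can check that the file is intact. This is a minimal sketch and not part of `rang`: the function names `record_checksum` and `verify_backup` are hypothetical, and it assumes `gzip` and GNU `sha256sum` are installed (on macOS, `shasum -a 256` is the usual substitute).

```sh
# Hypothetical helpers, not part of rang or Docker.
# Assumes gzip and GNU coreutils (sha256sum) are available.

# Write FILE.sha256 next to the tarball.
record_checksum() {
  sha256sum "$1" > "$1.sha256"
}

# Succeed only if the gzip stream is intact and the checksum matches.
# Run it from the same directory that was used for record_checksum.
verify_backup() {
  gzip -t "$1" && sha256sum -c "$1.sha256" > /dev/null
}

# Usage, after `docker save rangimg | gzip > rangimg.tar.gz`:
#   record_checksum rangimg.tar.gz
#   verify_backup rangimg.tar.gz && echo "backup looks intact"
```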
To restore the backup image:
```sh
docker load < rangimg.tar.gz
```
And launch a container the same way
```sh
docker run --rm --name "rangcontainer" -ti rangimg
```
# Apptainer/Singularity questions
**AQ1: I am on HPC and I don't have Docker there. Can I use Apptainer/Singularity instead of Docker?**
**AA1:** Docker may require root privileges and is usually not available on HPC systems. You might have Singularity or Apptainer instead. Apptainer/Singularity does not require root to run images (only to build them). You can build images on your own Linux PC (or in VirtualBox on Windows or macOS), on a virtual private server, or for [free in the cloud](https://cloud.sylabs.io/builder).
You have two options:
1. You can prepare (using `dockerize()`) and build a Docker image and convert it to an Apptainer/Singularity image. See [Apptainer/Singularity documentation](https://apptainer.org/docs/user/latest/docker_and_oci.html) for that.
2. You can use the `apptainerize()` function just like you would use `dockerize()`.
```r
apptainerize(graph, output_dir = "rangtest")
```
Afterwards you build an image:
```sh
cd rangtest
apptainer build container.sif container.def
# sudo singularity build container.sif container.def # same as above
```
And run the container:
```sh
apptainer exec container.sif R
# singularity exec container.sif R # same as above
```
To stop the container when you are done with it, just quit R.
`apptainer` and `singularity` shell commands are interchangeable, at least for now. See [Apptainer Singularity compatibility](https://apptainer.org/docs/user/latest/singularity_compatibility.html) for details.
The `apptainerize()`/`singularize()` functions work exactly the same as `dockerize()`, except that you cannot cache the Linux distribution rootfs.
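Option 1 (converting an already built Docker image) might look like the sketch below. It assumes the image from GA1 is still present in the local Docker daemon and relies on the `docker-daemon:` source described in the Apptainer documentation linked above. `docker_to_sif` is a hypothetical helper name, and to our understanding this source expects an explicit tag, hence the `:latest` default.

```sh
# Hypothetical helper, not part of rang or Apptainer.
# Usage: docker_to_sif IMAGE[:TAG] OUTPUT.sif
docker_to_sif() {
  case $1 in
    *:*) image=$1 ;;          # a tag was given, use it as is
    *)   image=$1:latest ;;   # assumption: an explicit tag is required
  esac
  # Build a SIF file from an image stored in the local Docker daemon.
  apptainer build "$2" "docker-daemon:$image"
}

# Example, for the image built in GA1:
#   docker_to_sif rangimg rangimg.sif
#   apptainer exec rangimg.sif R
```

Reading from the Docker daemon usually requires the same privileges as running `docker` itself.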
**AQ2: What if I want to run RStudio IDE in a container instead of just CLI R?**
**AA2:** To run the RStudio IDE in an Apptainer/Singularity container, some writable folders and a config file have to be created locally:
```bash
mkdir -p run var-lib-rstudio-server .rstudio
printf 'provider=sqlite\ndirectory=/var/lib/rstudio-server\n' > database.conf
```
After that, you can run the container (do not run it as the `root` user, otherwise you will not be able to log in to the RStudio IDE).
Start an instance (on the default RStudio port 8787):
```bash
apptainer instance start \
--bind run:/run,var-lib-rstudio-server:/var/lib/rstudio-server,database.conf:/etc/rstudio/database.conf,.rstudio:/home/rstudio/.rstudio/ \
container.sif \
rangtest
```
Now open a browser and go to localhost:8787.
The default username is your local username and the default password is 'set_your_password' (if you are using a container generated by rang).
List running instances:
```bash
apptainer instance list
```
Stop instance:
```bash
apptainer instance stop rangtest
```
Start instance with custom port (e.g. 8080) and password:
```bash
apptainer instance start \
--env RPORT=8080 \
--env PASSWORD='set_your_password' \
--bind run:/run,var-lib-rstudio-server:/var/lib/rstudio-server,database.conf:/etc/rstudio/database.conf,.rstudio:/home/rstudio/.rstudio/ \
container.sif \
rangtest
```
Run container with custom `rserver` command line:
```bash
apptainer exec \
--env PASSWORD='set_your_password' \
--bind run:/run,var-lib-rstudio-server:/var/lib/rstudio-server,database.conf:/etc/rstudio/database.conf,.rstudio:/home/rstudio/.rstudio/ \
container.sif \
/usr/lib/rstudio-server/bin/rserver \
--auth-none=0 --auth-pam-helper-path=pam-helper \
--server-user=$(whoami) --www-port=8787
```
If you run the container using the `apptainer exec` command, you will have to stop the server manually, either by killing the `rserver` process or by pressing Ctrl+C in the terminal where the container is running.
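The `--bind` value is identical in all the commands above, so it can be convenient to build it once and reuse it. A minimal sketch, assuming the folder layout from this answer: the variable name `RANG_BIND` is hypothetical, and the last line only prints the `apptainer` command instead of running it.

```bash
# Create the writable folders and the config file described above.
mkdir -p run var-lib-rstudio-server .rstudio
printf 'provider=sqlite\ndirectory=/var/lib/rstudio-server\n' > database.conf

# RANG_BIND is a hypothetical variable name; the paths are the ones used above.
RANG_BIND="run:/run"
RANG_BIND="$RANG_BIND,var-lib-rstudio-server:/var/lib/rstudio-server"
RANG_BIND="$RANG_BIND,database.conf:/etc/rstudio/database.conf"
RANG_BIND="$RANG_BIND,.rstudio:/home/rstudio/.rstudio/"

# Print the command for review; remove `echo` to actually start the instance.
echo apptainer instance start --bind "$RANG_BIND" container.sif rangtest
```

Keeping the mounts in one variable makes it harder for the different invocations to drift apart.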