--- title: "Creating a TI method: Container" date: "`r Sys.Date()`" author: - Wouter Saelens - Robrecht Cannoodt output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Creating a TI method: Container} %\VignetteEngine{knitr::rmarkdown} \usepackage[utf8]{inputenc} --- ```{r, include = FALSE} dir <- tempdir() knitr::opts_knit$set(root.dir = normalizePath(tempdir(), winslash = '/')) NOT_CRAN <- (Sys.getenv("NOT_CRAN") == "" || identical(tolower(Sys.getenv("NOT_CRAN")), "true")) && # don't run some chunks on CRAN dynwrap::test_docker_installation() # also don't run if docker is not available knitr::opts_knit$get("root.dir") ``` ```{r setup, message = FALSE} library(dynwrap) library(dplyr) ``` Once [you have wrapped a method using a script and definition](../create_ti_method_script), all you need to share your method is a _Dockerfile_ which lists all the dependencies that need to be installed. We'll work with the following _definition.yml_:
definition.yml
```{r, echo = FALSE}
definition_string <- paste0(readLines(system.file("examples/docker/definition.yml", package = "dynwrap")), collapse = "\n")
readr::write_file(definition_string, "definition.yml")
knitr::asis_output(paste0("```yaml\n", definition_string, "\n```"))
```

and _run.py_:
run.py
```{r, echo = FALSE}
run_py_script <- paste0(readLines(system.file("examples/docker/run.py", package = "dynwrap")), collapse = "\n")
readr::write_file(run_py_script, "run.py")
knitr::asis_output(paste0("```python\n", run_py_script, "\n```"))
```

Make sure it is executable:

```{bash, eval=NOT_CRAN}
chmod +x run.py
```

Assuming that _definition.yml_ and _run.py_ are located in the current directory, a minimal _Dockerfile_ would look like:
Dockerfile
```{r, echo = FALSE, warning = FALSE}
docker_file <- paste0(readLines(system.file("examples/docker/Dockerfile", package = "dynwrap")), collapse = "\n")
readr::write_file(docker_file, "Dockerfile")
knitr::asis_output(paste0("```Dockerfile\n", docker_file, "\n```"))
```

Here, `dynverse/dynwrappy` is the base image, which contains the latest version of R, Python, dynwrap, dyncli and most tidyverse dependencies. For R methods, you can use the `dynverse/dynwrapr` base image instead. While not required, starting from one of these base images is recommended, because dyncli provides an interface for running the method from the command line through the docker container. [As discussed before](../create_ti_method_script), wrapping is also a lot easier using dynwrap.
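Once the image has been built (see the build step further below), that dyncli entrypoint also lets you call the wrapped method directly from the shell. A minimal sketch, wrapped in `system()` like the build command later in this vignette; it assumes the image is tagged `my_ti_method` and that the dyncli entrypoint responds to a `--help` flag, which is an assumption rather than something this vignette guarantees:

```{r, eval=FALSE}
# ask the container's entrypoint to print its usage information
# (assumes the `my_ti_method` image built below and a `--help` flag handled by dyncli)
system("docker run --rm my_ti_method --help")
```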
For reproducibility, it's best to pin the base image to a specific tag. You can find these tags on Docker Hub: https://hub.docker.com/r/dynverse/dynwrapr/tags.
The _Dockerfile_ then copies the _definition.yml_ and _run.py_ files into the `/code` directory. Typically, you won't change the location of these files, simply to stay consistent with the rest of the method wrappers included in dynmethods. Finally, we specify the entrypoint, i.e. the script that will be executed when the container is run.
Do not specify this entrypoint using the shell form (`ENTRYPOINT /code/run.py`); use the exec form (`ENTRYPOINT ["/code/run.py"]`) instead, because the shell form does not pass command-line arguments on to the script.
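If you want to verify which entrypoint ended up in the image, you can query it with `docker inspect` once the image has been built in the next step. A quick sketch, using the `my_ti_method` name from the build command below:

```{r, eval=FALSE}
# print the entrypoint recorded in the built image; it should be the exec-form array
system("docker inspect --format '{{json .Config.Entrypoint}}' my_ti_method")
```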
That's it! Assuming that you have a functioning docker installation, you can build this container using

```{r, eval=NOT_CRAN}
system("docker build -t my_ti_method .")
```

## Testing it out

This method can now be loaded inside R using dynwrap:

```{r, eval=NOT_CRAN, error=TRUE}
method <- create_ti_method_container("my_ti_method")
dataset <- dynwrap::example_dataset
trajectory <- infer_trajectory(dataset, method(), verbose = TRUE)
```

If you have dynplot installed, you can also plot the trajectory:

```{r, eval=FALSE}
library(dynplot)
# for now, install from github using:
# remotes::install_github("dynverse/dynplot")
plot_graph(trajectory)
plot_heatmap(trajectory, expression_source = dataset$expression)
```

Congratulations! You now have a TI method that can be installed anywhere without dependency issues and that can be included in the whole dynverse pipeline. So what's left?

## Testing and continuous integration

To keep a project like this maintainable in the long run, it is important that the method is tested every time something changes, to make sure it still works.

### Creating an example dataset

To do this, we first add an _example.sh_ file, which generates an example dataset that is guaranteed to run without errors with the method. In this case, the example is simply the example dataset included in dynwrap.
example.sh
```{r, echo = FALSE}
example_script <- "#!/usr/bin/env Rscript
dataset <- dynwrap::example_dataset
file <- commandArgs(trailingOnly = TRUE)[[1]]
dynutils::write_h5(dataset, file)
"
readr::write_file(example_script, "example.sh")
knitr::asis_output(paste0("```R\n", example_script, "\n```"))
```
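If you're unsure what such a wrapped dataset contains, you can peek at the bundled object directly in R. A quick look, just listing the stored components (the `expression` slot is the same one used in the plotting example above):

```{r, eval=FALSE}
# inspect the bundled example dataset that example.sh writes to file
dataset <- dynwrap::example_dataset
names(dataset)          # components stored in the wrapped dataset
dim(dataset$expression) # expression matrix, with cells as rows
```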
You can of course provide your own example data here, but make sure it doesn't take too long to generate and that the TI method doesn't take too long to run on it. You can also fix the seed and add extra parameters to the example data, e.g. `dataset$seed <- 1` and `dataset$parameters <- list(component = 42)`, as in the sketch below.
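Putting that together, a customised _example.sh_ could look like the following sketch. Note that `component = 42` is just the made-up parameter from the example above; replace it with an actual parameter of your method.

```r
#!/usr/bin/env Rscript
# sketch of a customised example.sh: fix the seed and set example parameters
dataset <- dynwrap::example_dataset
dataset$seed <- 1
dataset$parameters <- list(component = 42)  # placeholder parameter, use your method's own

file <- commandArgs(trailingOnly = TRUE)[[1]]
dynutils::write_h5(dataset, file)
```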
Now run the example script and test it:

```{bash, eval=NOT_CRAN}
chmod +x example.sh
./example.sh example.h5
```

```{r, eval=NOT_CRAN, error=TRUE}
dataset <- dynutils::read_h5("example.h5")
trajectory <- infer_trajectory(dataset, method())
```

### Continuous integration

To automate the testing (and building) of containers, you'll have to use continuous integration. Because setting this up requires some manual steps, we suggest you create an issue at [dynmethods](https://github.com/dynverse/dynmethods/issues/new) so that we can help you further. Here is an example of one of our CI scripts: [https://github.com/dynverse/ti_paga/blob/master/.github/workflows/make.yml](https://github.com/dynverse/ti_paga/blob/master/.github/workflows/make.yml).

Once the continuous integration works, your method is ready to be included in [dynmethods](https://github.com/dynverse/dynmethods/issues/new)!