--- title: "Getting started" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Getting started} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) knitr::opts_chunk$set(eval = reticulate::py_module_available("anndata")) ``` The API of anndata for R is very similar to its Python counterpart. Check out `?anndata` for a full list of the functions provided by this package. `AnnData()` stores a data matrix `X` together with annotations of observations `obs` (`obsm`, `obsp`), variables `var` (`varm`, `varp`), and unstructured annotations `uns`. Here is an example of how to create an AnnData object with 2 observations and 3 variables. ```{r} library(anndata) ad <- AnnData( X = matrix(1:6, nrow = 2), obs = data.frame(group = c("a", "b"), row.names = c("s1", "s2")), var = data.frame(type = c(1L, 2L, 3L), row.names = c("var1", "var2", "var3")), layers = list( spliced = matrix(4:9, nrow = 2), unspliced = matrix(8:13, nrow = 2) ), obsm = list( ones = matrix(rep(1L, 10), nrow = 2), rand = matrix(rnorm(6), nrow = 2), zeros = matrix(rep(0L, 10), nrow = 2) ), varm = list( ones = matrix(rep(1L, 12), nrow = 3), rand = matrix(rnorm(6), nrow = 3), zeros = matrix(rep(0L, 12), nrow = 3) ), uns = list( a = 1, b = data.frame(i = 1:3, j = 4:6, value = runif(3)), c = list(c.a = 3, c.b = 4) ) ) ad ``` You can read the information back out using the `$` notation. ```{r} ad$X ad$obs ad$obsm[["ones"]] ad$layers[["spliced"]] ad$uns[["b"]] ``` ### Reading / writing AnnData objects Read from h5ad format: ```{r, eval = FALSE} read_h5ad("pbmc_1k_protein_v3_processed.h5ad") ``` ### Creating a view You can use any of the regular R indexing methods to subset the `AnnData` object. This will result in a 'View' of the underlying data without needing to store the same data twice. ```{r} view <- ad[, 2] view view$is_view ad[, c("var1", "var2")] ad[-1, ] ``` ### AnnData as a matrix The `X` attribute can be used as an R matrix: ```{r} ad$X[, c("var1", "var2")] ad$X[-1, , drop = FALSE] ad$X[, 2] <- 10 ``` You can access a different layer matrix as follows: ```{r} ad$layers["unspliced"] ad$layers["unspliced"][, c("var2", "var3")] ``` ### Note on state If you assign an AnnData object to another variable and modify either, both will be modified: ```{r} ad2 <- ad ad$X[, 2] <- 10 list(ad = ad$X, ad2 = ad2$X) ``` This is standard Python behaviour but not R. In order to have two separate copies of an AnnData object, use the `$copy()` function: ```{r} ad3 <- ad$copy() ad$X[, 2] <- c(3, 4) list(ad = ad$X, ad3 = ad3$X) ```