The latest stable version can be installed from CRAN.
install.packages('OmicFlow', dependencies = TRUE)

The development version is available on GitHub.
install.packages('pak') # if not yet installed
pak::pak('agusinac/OmicFlow')

OmicFlow expects your sample metadata to follow a simple but strict structure, so that all datasets are compatible and validated up front. Sample metadata can be supplied as a CSV/TSV file or as a data.table in R. In both cases the sample metadata must contain a header (the first line, if you supply a file), and each row represents one sample. Additional columns not mentioned here are allowed and are ignored during the metadata validation step.
- SAMPLE_ID → every row must have a unique, non-empty sample identifier.
- No spaces are allowed in IDs; use underscores `_` or dashes `-` instead.
Example:
| SAMPLE_ID | SAMPLEPAIR_ID | CONTRAST_Treatment | VARIABLE_Age |
|---|---|---|---|
| S1 | P1 | Drug | 42 |
| S2 | P1 | Placebo | 36 |
| S3 | P2 | Drug | 51 |
| Column | Type | Rules |
|---|---|---|
| SAMPLE_ID | string | Unique, no spaces, one per sample row |

| Column | Type | Rules |
|---|---|---|
| FEATURE_ID | string | Optional. No spaces. Names the feature identifiers used to include or exclude certain features |
| SAMPLEPAIR_ID | string | Optional. No spaces. Use when samples are paired and belong to an individual source/subject |
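
The rules in the tables above can be checked by hand before loading data. The following is a minimal sketch in base R, not OmicFlow's own validation step: `check_metadata()` is a hypothetical helper, and a plain data.frame stands in for the data.table or file you would normally supply.

```r
# Hypothetical helper (not part of OmicFlow): checks the ID rules described
# above and returns a character vector of problems (empty when none are found).
check_metadata <- function(meta) {
  problems <- character(0)
  if (!"SAMPLE_ID" %in% colnames(meta)) {
    return("SAMPLE_ID column is missing")
  }
  ids <- as.character(meta[["SAMPLE_ID"]])
  if (any(is.na(ids) | ids == "")) {
    problems <- c(problems, "SAMPLE_ID contains empty values")
  }
  if (anyDuplicated(ids) > 0) {
    problems <- c(problems, "SAMPLE_ID contains duplicates")
  }
  if (any(grepl("[[:space:]]", ids))) {
    problems <- c(problems, "SAMPLE_ID contains spaces")
  }
  # Optional ID columns must not contain spaces either
  for (col in intersect(c("FEATURE_ID", "SAMPLEPAIR_ID"), colnames(meta))) {
    if (any(grepl("[[:space:]]", as.character(meta[[col]])))) {
      problems <- c(problems, paste(col, "contains spaces"))
    }
  }
  problems
}

meta <- data.frame(
  SAMPLE_ID = c("S1", "S2", "S3"),
  SAMPLEPAIR_ID = c("P1", "P1", "P2"),
  CONTRAST_Treatment = c("Drug", "Placebo", "Drug"),
  VARIABLE_Age = c(42, 36, 51)
)
check_metadata(meta)
```

Returning a list of problems rather than stopping at the first one makes it easier to fix a metadata file in a single pass.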
You can define extra variables using special prefixes:
- CONTRAST_... → grouping/category labels used in differential comparisons
  Example: CONTRAST_Treatment with values Drug / Placebo
- VARIABLE_... → numeric or string variables for statistical analysis
  Example: VARIABLE_Age with values 42, 51, etc.
The prefixed columns are used only by the autoFlow function. At the moment only columns with the CONTRAST_ prefix are supported.
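
To see which columns a prefix-based scan would pick up, the prefixes can be matched against the column names. This is a small base R sketch under the assumption that matching is done on the literal prefix; `prefixed_columns()` is a hypothetical helper and not an OmicFlow function.

```r
# Hypothetical helper (not part of OmicFlow): lists the metadata columns
# whose names start with a given prefix, such as CONTRAST_ or VARIABLE_.
prefixed_columns <- function(meta, prefix) {
  colnames(meta)[startsWith(colnames(meta), prefix)]
}

meta <- data.frame(
  SAMPLE_ID = c("S1", "S2", "S3"),
  CONTRAST_Treatment = c("Drug", "Placebo", "Drug"),
  VARIABLE_Age = c(42, 36, 51)
)

contrasts <- prefixed_columns(meta, "CONTRAST_")
# Group labels that each contrast column would compare
lapply(meta[contrasts], unique)
```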
Example: Outputs a report.html file in the current working directory
taxa$autoFlow(
normalize = FALSE,
weighted = TRUE,
pvalue.threshold = 0.05
)

Note
Make sure your metadata meets the requirements!
Only the metagenomics class supports BIOM files, in either HDF5 (version 2) or JSON format, passed via biomData. The proteomics class supports only countData and featureData. treeData is optional in both omics sub-classes; when supplied, the rows of both countData and featureData are aligned to the tree tip labels.
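
If you are unsure which of the two BIOM encodings a file uses, the first bytes usually tell. The sketch below is a base R illustration and not an OmicFlow function: `biom_format()` is a hypothetical helper, and it assumes the HDF5 signature sits at the very start of the file.

```r
# Hypothetical helper (not part of OmicFlow): guesses the BIOM encoding
# from the first bytes of the file.
biom_format <- function(path) {
  # Standard 8-byte HDF5 file signature: \x89 H D F \r \n \x1a \n
  hdf5_signature <- as.raw(c(0x89, 0x48, 0x44, 0x46, 0x0d, 0x0a, 0x1a, 0x0a))
  head_bytes <- readBin(path, what = "raw", n = 64)
  if (length(head_bytes) >= 8 && identical(head_bytes[1:8], hdf5_signature)) {
    return("hdf5")
  }
  # JSON is plain text whose first non-blank character is "{" (byte 123)
  codes <- as.integer(head_bytes)
  codes <- codes[!codes %in% c(9L, 10L, 13L, 32L)]  # drop tab, LF, CR, space
  if (length(codes) > 0 && codes[1] == 123L) "json" else "unknown"
}

# Placeholder JSON content for illustration; this is not a valid BIOM table
json_file <- tempfile(fileext = ".biom")
writeLines('{"id": null}', json_file)
biom_format(json_file)
```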
library("OmicFlow")
metadata_file <- system.file("extdata", "metadata.tsv", package = "OmicFlow")
counts_file <- system.file("extdata", "counts.tsv", package = "OmicFlow")
features_file <- system.file("extdata", "features.tsv", package = "OmicFlow")
tree_file <- system.file("extdata", "tree.newick", package = "OmicFlow")
taxa <- metagenomics$new(
metaData = metadata_file,
countData = counts_file,
featureData = features_file,
treeData = tree_file
)
taxa$feature_subset(Kingdom == "Bacteria")
taxa$normalize()
# Access variables directly
taxa$metaData
taxa$countData
taxa$featureData
taxa$treeData
# Inspect which functions and variables are available in the class
str(taxa)
Note
All visualizations use color-blind-friendly palettes by default!
alpha_div <- taxa$alpha_diversity(
col_name = "treatment",
metric = "shannon",
paired = FALSE # If TRUE, a Wilcoxon signed-rank test is performed
)
alpha_div$plot

By default, PERMANOVA is applied pairwise to each pair of groups within the contrast specified by group_by, using pairwise_adonis. The permutation design in vegan::adonis2 defaults to free permutations. This is not always the right test, for example when you have paired samples or want to restrict permutations across different sites or genders. pairwise_adonis therefore accepts a custom permutation design through the perm_design argument: a function that builds the design with the permute package and whose result is passed on to vegan::adonis2. See the examples below.
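
To make the difference between free and restricted permutations concrete, here is a base R sketch. It does not call permute or vegan; `shuffle_within()` is a hypothetical helper that only mimics the idea of permuting labels inside each pair, and the sample labels are made up for illustration.

```r
# Hypothetical helper (not part of OmicFlow, permute or vegan): shuffles
# group labels only among samples that share the same block (e.g. a pair).
shuffle_within <- function(labels, blocks) {
  shuffled <- labels
  for (idx in split(seq_along(labels), blocks)) {
    # sample() on a length-one index would sample from 1:idx, so guard it
    if (length(idx) > 1) shuffled[idx] <- labels[sample(idx)]
  }
  shuffled
}

set.seed(1970)
treatment <- c("Drug", "Placebo", "Drug", "Placebo", "Drug", "Placebo")
pair_id   <- c("P1", "P1", "P2", "P2", "P3", "P3")

sample(treatment)                   # free: labels can move between pairs
shuffle_within(treatment, pair_id)  # restricted: each pair keeps one of each
```

Under the restricted scheme every permuted data set still has one Drug and one Placebo sample per pair, which is the property a paired design needs to preserve.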
set.seed(1970)
# Perform ordinations with in-built distance matrix computation
#--------------------------------------------------------------------------------
beta_div <- taxa$ordination(
metric = "unifrac",
method = "pcoa",
group_by = "treatment",
perm = 999
)
# Add a custom pre-computed distance matrix
#--------------------------------------------------------------------------------
qiime_unifrac <- data.table::fread("weighted-unifrac-matrix.tsv", header=TRUE)
distmat <- Matrix::Matrix(as.matrix(qiime_unifrac[, .SD, .SDcols = !c("V1")]))
rownames(distmat) <- colnames(distmat)
distmat <- distmat[taxa$metaData[["SAMPLE_ID"]], taxa$metaData[["SAMPLE_ID"]]]
distmat <- as.dist(distmat)
beta_div <- taxa$ordination(
distmat = distmat,
method = "pcoa",
group_by = "treatment",
perm = 999
)
# Add a custom permutation design via `perm_design`
#--------------------------------------------------------------------------------
## taxa$ordination() automatically passes taxa$metaData to the supplied function.
perm_design_func <- function(meta) {
base::with(
data = meta,
expr = permute::how(
nperm = 999,
plots = permute::Plots(meta$SAMPLEPAIR_ID, type = "none"), # Use when SAMPLEPAIR_ID is supplied
within = permute::Within(type = "free")
)
)
}
beta_div <- taxa$ordination(
metric = "unifrac",
method = "pcoa",
group_by = "treatment",
perm_design = perm_design_func
)
patchwork::wrap_plots(
beta_div[c("scree_plot", "anova_plot", "scores_plot")],
nrow = 1)

res <- taxa$composition(
feature_rank = "Genus",
feature_filter = c("uncultured"),
feature_top = 15,
normalize = FALSE,
col_name = "CONTRAST_sex"
)
composition_plot(
data = res$data,
palette = res$palette,
feature_rank = "Genus",
# If group_by = NULL, then a stacked barplot for each sample sorted alphabetically will be visualized.
group_by = "CONTRAST_sex"
)

The volcano_plot shows the average percentage abundance of each Genus across the two conditions. The pvalue.threshold, foldchange.threshold and abundance.threshold parameters can be used to keep only the relevant bacteria. The returned p-values can be adjusted and used for a new volcano plot via OmicFlow::volcano_plot.
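
As an illustration of adjusting p-values and applying the three thresholds, the sketch below uses base R on a toy table. The column names (`log2FC`, `pvalue`, `abundance`), the numbers and the cut-offs are assumptions made for this example; they are not the actual output of taxa$DFE(), and the fold change is assumed to be on a log2 scale.

```r
# Toy results table; column names are illustrative assumptions and the
# numbers are made up, they are not output from taxa$DFE().
dfe <- data.frame(
  Genus     = c("A", "B", "C", "D"),
  log2FC    = c(2.1, -0.3, -1.8, 0.9),
  pvalue    = c(0.001, 0.20, 0.01, 0.04),
  abundance = c(5.0, 0.1, 2.5, 1.2)  # average percentage abundance
)

# Benjamini-Hochberg adjustment for multiple testing
dfe$padj <- p.adjust(dfe$pvalue, method = "BH")

# Keep features that pass all three thresholds
keep <- dfe$padj < 0.05 &
  abs(dfe$log2FC) >= 1 &
  dfe$abundance >= 1
dfe[keep, ]
```

In this toy table, genus D passes the 0.05 cut-off on its raw p-value but not after adjustment, which is why it helps to adjust before filtering.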
res <- taxa$DFE(
feature_rank = "Genus",
feature_filter = c("uncultured"),
paired = FALSE,
normalize = FALSE,
condition.group = "CONTRAST_sex",
condition_A = "male",
condition_B = "female"
)
res$volcano_plot

Note
Symbolic links do not work with mounted volumes; please copy the original file instead!
Example: Outputs a report.html file in the current working directory
docker pull agusinac/autoflow:1.4.0
# Mount the current directory at /data, use it as the working directory,
# run as the current non-root user, and call the autoflow R script.
docker run -it --rm \
  -v "$(pwd)":/data \
  -w /data \
  -u "$(id -u):$(id -g)" \
  agusinac/autoflow:1.4.0 \
  autoflow \
  -b /data/biom_with_taxonomy_hdf5.biom \
  -m /data/metadata.tsv

If you are having issues, please create a ticket.