FAQ
faqs.Rmd
Frequently Asked Questions
If you have a question not answered here, please raise a discussion on our Github.
Should I use Pseudotime or Velocity data?
Pseudotime trajectories and RNA velocity are both popular tools for
studying cell differentiation and fate decisions. However, both are
useful in specific circumstances and their usefulness varies depending
on the dataset. In general, RNA Velocity is most useful for cell
differentiation that occurs relatively quickly (~hours/days). In
contrast, Pseudotime (e.g. Monocle3
) is more useful for
longer time scale dynamics. It is important to note that, due to their
dependence on dimension reduction, pseudotime-based methods require
sufficient sampling at all intermediate stages of a cell’s
differentiation continuum.
What does negative variance explained indicate?
Variance explained is negative when the variance in the model is greater than the variance in the data, effectively it means that the model is not fitting the data well. Biologically, this could indicate that the genes are not under environmental control, and therefore the observed trajectory dynamics have little or no relationship with activity of available ligands.
I have low or negative variance explained, but the output ligands align well or ‘make sense’ with known biology!
Because ligand importance is a relative value, this could indicate that the output ligands are responsible for a small subset of variance in your data e.g. a small selection of genes, while the majority of gene expression is not influenced by the environment. Because most of the gene expression variance is cell-intrinsic, this could lead to an overall negative variance explained while your ligands still influence a small number of (potentially important) genes.
Overall, negative variance explained does not mean the ligands are automatically wrong. Rather, it means that, across the trajectory, there is more variance attributable to cell-intrinsic or other mechanisms than is attributable to the environment.
What is a ‘good’ variance explained?
A ‘good’ variance explained is difficult to define, particularly as the quality of trajectory or velocity results is so dependent on the assumptions of the underlying data. Biologically, it does not make sense to have a large amount of gene expression (e.g. >30% variance) to be under the influence of environmental factors. The majority of variance in gene expression, from cell to cell, is unlikely to be directly caused by their environment. So depending on your biological system, you could expect anywhere between 1% and 30% of the gene expression variance to be attributable to environmental factors.
Install Errors
Errors installing Monocle3 on an Apple Silicon Mac.
As of Oct 2022, I found Monocle3 quite difficult to install on my
2021 M1 MacBook because some of it’s dependencies did not have
compatibility with Arm64 processors. To get around this, I created an
x86_64 terminal shortcut that runs through Apple’s Rosetta platform,
using this
tutorial. This emulates an x86_64 Intel processor and runs an
x86 version
of conda
in that terminal only. I
also used mamba
to keep the x86_64 packages separate from
the Apple Silicon-compatible packages.
The following assumes you have an x86_64
emulated
terminal, and an x86
version of conda. (I.e., you have
completed the steps in this
tutorial )
In your new x86_64
terminal, type the following
one line at a time.
conda create -n x86_64
conda activate x86_64
conda env config vars set CONDA_SUBDIR=osx-64
export PATH=$base_dir/miniconda/bin:$PATH
conda deactivate
conda activate x86_64
conda install r-seurat r-monocle3 r-devtools r-biocmanager #optionally, you could use mamba which should install everything in less than a minute.
Now open up R and check that it has run smoothly by opening R in your terminal typing the following:
R.Version()$system
R.home()
It should say x86_64
, and R.home() points to an anaconda
directory instead of your default system R. This means you’ve installed
an x86_64
version of R
inside the
x86_64
terminal and it is localized to the conda
environment.
You may also need to install GenomeInfoDb, so type the following in R.
BiocManager::install("GenomeInfoDbData")
Errors installing Monocle3 with R versions greater than 4.1.0
As of July 2023, Monocle3 is not compatible with current versions of
R. To fix this, I did the following using mamba
, the faster
alternative to conda
.
mamba create -n entrain-env
mamba actiate entrain-env
mamba install r-devtools r-r.utils r-base=4.1.0 gdal r-terra r-spdep udunits2
Then I navigated to the conda-meta
directory of my conda
environment. For me, this was located in
~/mambaforge/envs/entrain-env/conda-meta/
. Here, I created
a file called pinned
and wrote a single line:
r-base ==4.1.0
This file simply prevents the R version in the environment from being updated to R 4.3, allowing Monocle to work.
Then, I ran in terminal:
mamba install -c bioconda r-monocle3
Then I typed R
into terminal, and ran the following R
code:
devtools::install_github(c("satijalab/seurat-wrappers", "theimagelab/entrain"))