Differential Expression & Splicing
SpliceVI supports both differential gene expression (DE) and differential splicing (DS) analysis between groups of cells. Both use the same underlying Bayesian framework from scvi-tools — they differ only in what quantity is being compared and how the effect size is defined. The API mirrors MultiVI, with differential_splicing and get_normalized_splicing replacing the chromatin accessibility equivalents.
For full background on the statistical framework, see the scvi-tools differential expression guide.
How it works
Both DE and DS use _de_core from scvi-tools internally. The key idea is:
- Sample normalized values from the posterior for each cell in group 1 and group 2 separately, by passing cells through the encoder and then the relevant decoder (expression or splicing)
- Compute a per-feature effect size from those posterior samples
- Estimate the posterior probability that the effect size exceeds a threshold \(\delta\) (the "change" mode)
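The per-feature Bayes factor, reported on a log scale, is then:
\[\mathrm{BF}_f = \log \frac{p_f}{1 - p_f}\]
where \(p_f\) is the posterior probability from the last step, i.e. the probability that the effect size of feature \(f\) exceeds \(\delta\) in magnitude.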
An FDR-controlled call of differential features is made using a target FDR threshold (default 5%).
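The steps above can be sketched with NumPy. This is a minimal illustration and not the _de_core implementation: the function name summarize_change, the toy posterior draws, the two-sided threshold, and the expected-FDR rule (keep the largest ranked set whose mean posterior error stays within the target) are all assumptions made for the example.

```python
import numpy as np

def summarize_change(effect_samples, delta, fdr_target=0.05, eps=1e-8):
    """effect_samples: (n_samples, n_features) posterior draws of the effect size."""
    # Posterior probability that |effect| exceeds delta ("change" mode).
    proba = (np.abs(effect_samples) >= delta).mean(axis=0)
    # Log Bayes factor: log odds of "changed" versus "not changed".
    log_bf = np.log(proba + eps) - np.log(1.0 - proba + eps)
    # Expected-FDR call: rank features by proba and keep the largest set
    # whose mean posterior error (1 - proba) stays within fdr_target.
    order = np.argsort(-proba)
    expected_fdr = np.cumsum(1.0 - proba[order]) / np.arange(1, proba.size + 1)
    n_called = int((expected_fdr <= fdr_target).sum())
    is_called = np.zeros(proba.size, dtype=bool)
    is_called[order[:n_called]] = True
    return proba, log_bf, is_called

# Toy posterior draws for three features (made-up numbers).
rng = np.random.default_rng(0)
samples = np.column_stack([
    rng.normal(0.50, 0.05, size=1000),  # clearly shifted
    rng.normal(0.00, 0.05, size=1000),  # no shift
    rng.normal(0.12, 0.05, size=1000),  # borderline
])
proba, log_bf, is_called = summarize_change(samples, delta=0.10)
for p, b, c in zip(proba, log_bf, is_called):
    print(f"proba={p:.3f}  log_bf={b:+.2f}  called={c}")
```

Only the clearly shifted feature should be called here. The borderline feature has a sizeable posterior probability, but adding it would push the expected FDR of the called set past 5%.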
Differential Expression
DE compares normalized gene expression between two groups using get_normalized_expression (see Imputed Splicing & Expression).
The effect size follows the standard scVI convention, a log-fold change:
\[\mathrm{LFC}_g = \log_2 \hat{x}^{(1)}_g - \log_2 \hat{x}^{(2)}_g\]
where \(\hat{x}^{(k)}_g\) is the posterior mean normalized expression for gene \(g\) in group \(k\).
de_results = model.differential_expression(
    adata=mdata,
    groupby="cell_type",
    group1="Neuron",
    group2="Astrocyte",
    delta=0.25,
    fdr_target=0.05,
)
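To get a feel for what delta=0.25 means on a log2 scale, the short calculation below converts the threshold into a fold change and checks a pair of hypothetical group means against it. The expression values and the sign convention (group 1 minus group 2) are assumptions for illustration. The model applies the threshold to posterior samples, not to a single point estimate as done here.

```python
import math

delta = 0.25                   # threshold on |log2 fold change|
min_fold_change = 2 ** delta   # smallest ratio that counts as a change

# Hypothetical posterior mean normalized expression in each group.
x_group1, x_group2 = 1.2e-4, 0.9e-4
lfc = math.log2(x_group1) - math.log2(x_group2)
exceeds = abs(lfc) >= delta

print(f"delta={delta} corresponds to a fold change of {min_fold_change:.3f}")
print(f"LFC={lfc:.3f}, exceeds delta: {exceeds}")
```

A threshold of 0.25 therefore asks for roughly a 19% change in normalized expression in either direction.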
Differential Splicing
DS compares junction usage (PSI) between two groups. Because PSI is already on a \([0, 1]\) probability scale, the effect size is a direct difference rather than a log-fold change, analogous to how ATAC-seq accessibility scores are handled in scvi-tools:
\[\Delta\psi_j = \hat{\psi}^{(2)}_j - \hat{\psi}^{(1)}_j\]
where \(\hat{\psi}^{(k)}_j\) is the posterior mean PSI for junction \(j\) in group \(k\), computed using the DM posterior mean by default (norm_splicing_function="dm_posterior_mean"). You can switch to the raw decoder output with norm_splicing_function="decoder".
ds_results = model.differential_splicing(
    adata=mdata,
    groupby="cell_type",
    group1="Neuron",
    group2="Astrocyte",
    delta=0.10,
    fdr_target=0.05,
    norm_splicing_function="dm_posterior_mean",  # recommended
)
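A small worked example may help show why a difference is the more natural effect size for PSI. The two junctions below are hypothetical. Both double their usage between groups, so a log-fold change cannot tell them apart, while the PSI difference separates a negligible shift from a large one.

```python
import math

# Hypothetical posterior mean PSI values (group 1, group 2) for two junctions.
junctions = {
    "rare_junction": (0.02, 0.04),  # doubles, but usage stays tiny
    "major_switch": (0.30, 0.60),   # doubles, large shift in usage
}
delta = 0.10

results = {}
for name, (psi1, psi2) in junctions.items():
    abs_lfc = abs(math.log2(psi2 / psi1))  # identical for both junctions
    dpsi = psi2 - psi1                     # group 2 minus group 1
    results[name] = (abs_lfc, dpsi, abs(dpsi) >= delta)
    print(f"{name}: |LFC|={abs_lfc:.2f}, delta_psi={dpsi:+.2f}, passes={results[name][2]}")
```

With delta=0.10, only the junction whose usage moves by 30 percentage points passes the threshold.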
Output columns
| Column | Description |
|---|---|
| proba_ds | Posterior probability of differential splicing |
| is_ds_fdr | Boolean FDR-controlled call at fdr_target |
| bayes_factor | Log Bayes factor |
| effect_size | \(\hat{\psi}^{(2)} - \hat{\psi}^{(1)}\) (model posterior means) |
| emp_effect | Empirical \(\bar{\psi}^{(2)} - \bar{\psi}^{(1)}\) from observed PSI values |
| est_prob1 / est_prob2 | Model posterior mean PSI per group |
| emp_prob1 / emp_prob2 | Empirical mean PSI per group (observed cells only) |
| n_obs_group1 / n_obs_group2 | Number of cells with observed data per junction per group |
Notes
- Very sparse junctions (low n_obs_group1 / n_obs_group2) will have wider posteriors; consider filtering them before interpretation
- Both methods support batch_correction=True to marginalize over batch effects when comparing groups
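One way to act on the sparsity note is to post-filter the DS results with pandas. The sketch below uses a toy DataFrame in place of the real output. The column names follow the output table, but the values, the junction labels, and the min_cells cutoff are made up for illustration.

```python
import pandas as pd

# Toy stand-in for the DataFrame returned by differential_splicing.
ds_results = pd.DataFrame(
    {
        "proba_ds": [0.99, 0.97, 0.40, 0.98],
        "is_ds_fdr": [True, True, False, True],
        "effect_size": [0.35, -0.22, 0.03, 0.28],
        "emp_effect": [0.31, -0.25, 0.01, -0.05],
        "n_obs_group1": [120, 85, 200, 4],
        "n_obs_group2": [140, 90, 180, 6],
    },
    index=["junc_A", "junc_B", "junc_C", "junc_D"],
)

min_cells = 20  # hypothetical cutoff; tune to your dataset

# Keep junctions observed in enough cells in both groups.
well_observed = (ds_results["n_obs_group1"] >= min_cells) & (
    ds_results["n_obs_group2"] >= min_cells
)
hits = ds_results[ds_results["is_ds_fdr"] & well_observed]

# Largest absolute PSI shifts first.
hits = hits.sort_values("effect_size", key=lambda s: s.abs(), ascending=False)
print(hits[["proba_ds", "effect_size", "emp_effect"]])
```

In this toy table, junc_D passes the FDR call but is observed in only a handful of cells, and its empirical effect disagrees in sign with the model estimate. Comparing effect_size against emp_effect is a cheap sanity check for cases like this.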