Approach & Outputs

We exploit new data types and new types of experiments and studies by developing the computational techniques needed to turn raw data into biology.

Modern Statistics for Modern Biology textbook, with Susan Holmes: online version. There is also a print version published by CUP.

Cellular neighborhood analysis of healthy and malignant lymph nodes based on single-cell resolution spatial proteomics by multiplexed immunohistochemistry.

Cluster-free differential expression analysis of sc-RNA-seq data using LEMUR. Paper link.

Comparison of transformations for single-cell RNA-seq data. Paper link.

Ternary plots of relative sensitivities to targeted kinase inhibitors for a cohort of primary tumour samples of chronic lymphocytic leukaemia (CLL). Paper link.

Spatial omics and imaging

Modern spatial omics techniques measure the spatial distribution of tens of thousands of distinct molecules, with major limiting factors being sampling efficiency and spatial resolution. Photon microscopy observes one or a few distinct molecules at resolutions of tens of nanometers. Electron microscopy measures e\(^-\) densities at resolutions of a few angstroms. But how do we bring all this together? How do we navigate huge, terabyte-scale maps of cells and organisms? How do we decipher biological functions and processes, associate them with phenotypes in health and disease, and exploit this to better understand fundamental biology and to advance biotechnology and biomedicine?

Functional precision medicine and immuno-oncology

We integrate observational ’omics data, interventional clinical data, and systematic genetic or chemical perturbation data on (ex vivo) model systems to decipher the molecular mechanisms of variable sensitivity and resistance of tumours to treatments (precision oncology collaboration with Thorsten Zenz at University Hospital Zurich), and to understand the role of the immune system and the tumour microenvironment in tumourigenesis, progression and treatment (systems immunology collaboration with Sascha Dietrich at University Hospital Düsseldorf).

Open science

As we engage with new data types, we aim to develop high-quality computational methods of wide applicability. We consider the release and maintenance of scientific software an integral part of doing science. We contribute to the Bioconductor project, an open source software collaboration to provide tools for the analysis and understanding of genome-scale data. An example is our DESeq2 package for analyzing count data from high-throughput sequencing.
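To give a flavour of what analyzing sequencing count data involves, here is a minimal sketch of median-of-ratios size-factor estimation, the normalization idea used in DESeq2 to make samples of different sequencing depth comparable. It is a plain NumPy illustration, not the DESeq2 API (DESeq2 is an R package); the function name and the toy counts are made up for this example.

```python
import numpy as np

def median_of_ratios_size_factors(counts):
    """Estimate per-sample size factors from a genes x samples count matrix."""
    counts = np.asarray(counts, dtype=float)
    # Use only genes with nonzero counts in every sample, so the log is finite.
    expressed = np.all(counts > 0, axis=1)
    if not expressed.any():
        raise ValueError("no gene has nonzero counts in all samples")
    log_counts = np.log(counts[expressed])
    # The log geometric mean of each gene across samples acts as a pseudo-reference.
    log_geo_means = log_counts.mean(axis=1, keepdims=True)
    # Size factor of a sample: median ratio of its counts to the reference.
    return np.exp(np.median(log_counts - log_geo_means, axis=0))

# Toy data (made up): the second sample is sequenced about twice as deeply.
toy_counts = np.array([[10, 20], [100, 200], [5, 10], [0, 3]])
size_factors = median_of_ratios_size_factors(toy_counts)
normalized = toy_counts / size_factors  # counts on a common scale
```

Dividing each column by its size factor puts the samples on a common scale; the median makes the estimate robust to a minority of genes that are strongly differentially expressed.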

Mentoring and career development

Science is an intellectual adventure and a creative process done by people. For each of us, our work is at the same time a means to achieve a scientific goal, a job that enables us to pay our bills, and a stage of training and professional development. This applies to student internships, BSc/MSc theses, PhD theses and postdoctoral projects. The group, and EMBL more generally, offers a well-established mentoring framework to support these three objectives. Former group members have moved on to rewarding careers as professors, independent group leaders, and senior managers or professional scientists in industry.

Teaching

We maintain the textbook Modern Statistics for Modern Biology by Susan Holmes and Wolfgang Huber. The book is available online, for free, as HTML. It was published as a printed book in 2019 by Cambridge University Press.

We run the annual summer school CSAMA—Biological Data Science. It usually takes place in June in Brixen/Bressanone. Here is the webpage of the 2025 edition. See here for some impressions.

In July 2023, 2024 and 2025, we co-organized the Ukrainian Biological Data Science Summer School in Uzhhorod, Ukraine. See also Wolfgang’s post about it.

We develop publicly available interactive training materials on statistical methods.

Software

We are frequent contributors to the Bioconductor project:

LEMUR: Cluster-free differential expression analysis of multi-condition single-cell data using Latent Embedding Multivariate Regression
MOFA: Multi-Omics Factor Analysis
DESeq2: Differential gene expression analysis based on the negative binomial distribution
IHW: Multiple testing and false discovery rate (FDR) control by Independent Hypothesis Weighting
EBImage: Image processing and analysis toolbox for R
Rarr: Read Zarr Files in R
rhdf5: R Interface to HDF5
vsn: Normalization and variance stabilizing transformation of fluorescence intensity data
cellHTS2: Analysis of cell-based high-throughput screens
DEXSeq: Inference of differential exon usage in RNA-Seq
HilbertVis: Visualize long vectors of data using Hilbert curves
Python
HTSeq: Processing and analyzing data from high-throughput sequencing assays
SOFA: Semi-supervised (Multi) Omics Factor Analysis
spatialproteomics: Lightweight wrapper around xarray to facilitate processing, exploration and analysis of multiplexed immunohistochemistry data
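
As an illustration of the hypothesis-weighting idea behind IHW in the list above, the sketch below implements the weighted Benjamini-Hochberg step: each p-value is divided by a non-negative weight (weights are rescaled to average 1), and the usual step-up rule is applied. This is only one ingredient and rests on stated assumptions: the weights here are supplied by hand, whereas IHW learns them from a covariate, and the function name `weighted_bh` and the toy numbers are hypothetical, not part of any package API.

```python
import numpy as np

def weighted_bh(pvalues, weights, alpha=0.1):
    """Weighted Benjamini-Hochberg: returns a boolean rejection mask."""
    p = np.asarray(pvalues, dtype=float)
    w = np.asarray(weights, dtype=float)
    if p.shape != w.shape:
        raise ValueError("pvalues and weights must have the same shape")
    if np.any(w < 0) or w.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")
    w = w / w.mean()  # rescale so that the weights average to 1
    # Weighted p-values; hypotheses with zero weight get 1 and are never rejected.
    q = np.ones_like(p)
    positive = w > 0
    q[positive] = np.minimum(p[positive] / w[positive], 1.0)
    m = p.size
    order = np.argsort(q)
    thresholds = alpha * np.arange(1, m + 1) / m
    passed = q[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        # Step-up rule: reject everything up to the largest rank that passes.
        k = np.nonzero(passed)[0].max()
        reject[order[: k + 1]] = True
    return reject

# Toy p-values (made up). Uniform weights reduce to the ordinary BH procedure.
toy_p = [0.01, 0.04, 0.03, 0.5]
plain = weighted_bh(toy_p, [1, 1, 1, 1], alpha=0.1)
weighted = weighted_bh(toy_p, [2, 2, 0, 0], alpha=0.1)
```

With uniform weights the result matches unweighted Benjamini-Hochberg; shifting weight towards hypotheses that are more likely to be true discoveries is what can increase power at the same nominal FDR level.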