CSAMA 2018 – Statistical Data Analysis for Genome Biology

CSAMA 2018 (16th edition)
Statistical Data Analysis for Genome-Scale Biology
Bressanone-Brixen, Italy (South Tyrol Alps)
July 8-13, 2018

Lecturers:

  • Vincent J. Carey, Harvard Medical School
  • Laurent Gatto, University of Cambridge
  • Robert Gentleman, 23andMe, Mountain View
  • Laleh Haghverdi, European Molecular Biology Laboratory (EMBL), Heidelberg
  • Wolfgang Huber, European Molecular Biology Laboratory (EMBL), Heidelberg
  • Michael I. Love, University of North Carolina-Chapel Hill
  • Martin Morgan, Roswell Park Comprehensive Cancer Center, Buffalo
  • Johannes Rainer, European Academy of Bozen (EURAC)
  • Charlotte Soneson, University of Zurich
  • Levi Waldron, CUNY School of Public Health at Hunter College, New York

Teaching Assistants:

  • Simone Bell, EMBL, Heidelberg
  • Vladislav Kim, EMBL, Heidelberg
  • Lori Shepherd, RPCCC, Buffalo
  • Mike L. Smith, EMBL, Heidelberg

The one-week intensive course Statistical Data Analysis for Genome-Scale Biology teaches statistical and computational analysis of multi-omics studies in biology and biomedicine. It covers the underlying theory and state of the art (the morning lectures) and practical hands-on exercises based on the R / Bioconductor environment (the afternoon labs). At the end of the course, you should be able to run analysis workflows on your own (multi-)omic data, adapt and combine different tools, and make informed and scientifically sound choices about analysis strategies.

Topics include:

  • Introduction to R and Bioconductor
  • The elements of statistics: hypothesis testing, multiple testing, regression, regularization, clustering and classification (machine learning), parallelization and performance, visualisation
  • RNA-Seq data analysis
  • Computing with sequences and genomic intervals
  • Working with annotation – genes, genomic features, variants, transcripts and proteins
  • Gene set enrichment analysis
  • Mass spec proteomics and metabolomics
  • Basics of microbiome analysis
  • Experimental design, batch effects and confounding
  • Reproducible research and workflow authoring with R Markdown
  • Package development, version control and developer tools (incl. Git, GitHub, RStudio)
  • Working with large data: performance, parallelisation and cloud computing

The course consists of

  • 20 morning lectures of 45 minutes each, Monday to Friday, 8:30h – 12:00h
  • 4 practical computer tutorials in the afternoons (13:30h – 16:30h) on Monday, Tuesday, Thursday and Friday

Visit the course’s website at: http://www.huber.embl.de/csama