CSAMA 2018 – Statistical Data Analysis for Genome Biology

CSAMA 2018 (16th edition)
Statistical Data Analysis for Genome Scale Biology
Bressanone-Brixen, Italy (South Tyrol Alps)
July 8-13, 2018

Lecturers:

  • Vincent J. Carey, Harvard Medical School
  • Laurent Gatto, University of Cambridge
  • Robert Gentleman, 23andMe, Mountain View
  • Laleh Haghverdi, European Molecular Biology Laboratory (EMBL), Heidelberg
  • Wolfgang Huber, European Molecular Biology Laboratory (EMBL), Heidelberg
  • Michael I. Love, University of North Carolina-Chapel Hill
  • Martin Morgan, Roswell Park Comprehensive Cancer Center, Buffalo
  • Johannes Rainer, European Academy of Bozen (EURAC)
  • Charlotte Soneson, University of Zurich
  • Levi Waldron, CUNY School of Public Health at Hunter College, New York

Teaching Assistants:

  • Simone Bell, EMBL, Heidelberg
  • Vladislav Kim, EMBL, Heidelberg
  • Lori Shepherd, RPCCC, Buffalo
  • Mike L. Smith, EMBL, Heidelberg

The one-week intensive course Statistical Data Analysis for Genome-Scale Biology teaches statistical and computational analysis of multi-omics studies in biology and biomedicine. It covers the underlying theory and state of the art (the morning lectures) and practical hands-on exercises based on the R / Bioconductor environment (the afternoon labs). At the end of the course, you should be able to run analysis workflows on your own (multi-)omic data, adapt and combine different tools, and make informed and scientifically sound choices about analysis strategies.

Topics include:

  • Introduction to R and Bioconductor
  • The elements of statistics: hypothesis testing, multiple testing, regression, regularization, clustering and classification (machine learning), parallelization and performance, visualisation
  • RNA-Seq data analysis
  • Computing with sequences and genomic intervals
  • Working with annotation – genes, genomic features, variants, transcripts and proteins
  • Gene set enrichment analysis
  • Mass spec proteomics and metabolomics
  • Basics of microbiome analysis
  • Experimental design, batch effects and confounding
  • Reproducible research and workflow authoring with R markdown
  • Package development, version control and developer tools (incl. git, github, RStudio)
  • Working with large data: performance, parallelisation and cloud computing

The course consists of

  • morning lectures (20 x 45 minutes), Monday to Friday, 8:30h – 12:00h
  • 4 practical computer tutorials in the afternoons (13:30h – 16:30h) on Monday, Tuesday, Thursday and Friday

Visit the course’s website at: http://www.huber.embl.de/csama

Postdoc, PhD and internship positions

We are continually inviting applications for postdoc, PhD and internship positions. You can apply for one of two tracks:

  1. Method development in statistical computing and bioinformatics,
  2. Biological discovery through integrative data analysis (“dry biology”)

For track 1, you will have strong quantitative and analytical skills, such as those acquired through a degree in mathematics, statistics, physics, computer science or a related field. You have the curiosity and motivation to work on interdisciplinary projects, which include the generation of new data and their analysis, and are eager to get to grips with relevant areas of biology and the technologies used in biology research. You will have experience in scientific computing and be familiar with one or several programming languages. Familiarity with R is definitely a plus.

For track 2, you will have training in the life sciences and strong coding skills that enable you to undertake complex data transformations, integrative operations, applications of mathematical models and visualizations. You are driven to make fundamental discoveries by mining cutting-edge, large data sets.

To apply, please contact Wolfgang with your CV, a brief statement of research interests, and examples of your work: besides your publications, this can include theses, research reports, talk slides, software projects (e.g. R packages, github projects) or data analysis reports (e.g. markdown reports or Jupyter notebooks).

Here are some keywords and a non-exhaustive list of collaboration partners with whom we work frequently on new, exciting data types:

  • Latent space and manifold estimation from multi-modal single-cell data
  • Genotype-drug interactions, precision oncology, multivariate biomarker discovery
  • Imaging-based phenotyping
  • Bioconductor
  • Thorsten Zenz – pharmacogenomics of drug response in blood cancer
  • Sascha Dietrich – systems medicine of cancer drugs
  • Lars Steinmetz – systems genetics & ‘omics technology development
  • Michael Boutros – high-throughput genetics, genetic interactions & synthetic lethality in cancer
  • Henrik Kaessmann – evolution of cell types