CSAMA 2018 – Statistical Data Analysis for Genome Scale Biology

CSAMA 2018 (16th edition)
Statistical Data Analysis for Genome Scale Biology
Bressanone-Brixen, Italy (South Tyrol Alps)
July 8-13, 2018

Lecturers:

  • Vincent J. Carey, Harvard Medical School
  • Laurent Gatto, University of Cambridge
  • Robert Gentleman, 23andMe, Mountain View
  • Laleh Haghverdi, European Molecular Biology Laboratory (EMBL), Heidelberg
  • Wolfgang Huber, European Molecular Biology Laboratory (EMBL), Heidelberg
  • Michael I. Love, University of North Carolina-Chapel Hill
  • Martin Morgan, Roswell Park Comprehensive Cancer Center, Buffalo
  • Johannes Rainer, European Academy of Bozen (EURAC)
  • Charlotte Soneson, University of Zurich
  • Levi Waldron, CUNY School of Public Health at Hunter College, New York

Teaching Assistants:

  • Simone Bell, EMBL, Heidelberg
  • Vladislav Kim, EMBL, Heidelberg
  • Lori Shepherd, RPCCC, Buffalo
  • Mike L. Smith, EMBL, Heidelberg

The one-week intensive course Statistical Data Analysis for Genome-Scale Biology teaches statistical and computational analysis of multi-omics studies in biology and biomedicine. It covers the underlying theory and state of the art (the morning lectures) and practical hands-on exercises based on the R / Bioconductor environment (the afternoon labs). At the end of the course, you should be able to run analysis workflows on your own (multi-)omic data, adapt and combine different tools, and make informed and scientifically sound choices about analysis strategies.

Topics include:

  • Introduction to R and Bioconductor
  • The elements of statistics: hypothesis testing, multiple testing, regression, regularization, clustering and classification, parallelization and performance (machine learning), visualisation
  • RNA-Seq data analysis
  • Computing with sequences and genomic intervals
  • Working with annotation – genes, genomic features, variants, transcripts and proteins
  • Gene set enrichment analysis
  • Mass spec proteomics and metabolomics
  • Basics of microbiome analysis
  • Experimental design, batch effects and confounding
  • Reproducible research and workflow authoring with R markdown
  • Package development, version control and developer tools (incl. git, github, RStudio)
  • Working with large data: performance, parallelisation and cloud computing

The course consists of

  • morning lectures: 20 x 45 minutes: Monday to Friday 8:30h – 12:00h
  • 4 practical computer tutorials in the afternoons (13:30h – 16:30h) on Monday, Tuesday, Thursday and Friday

Visit the course’s website at: http://www.huber.embl.de/csama

Postdoc, PhD and internship positions

We are continually inviting applications for postdoc, PhD and internship positions. You can apply for one of two tracks:

  1. Method development in statistical computing and bioinformatics,
  2. Biological discovery through integrative data analysis (“dry biology”)

For track 1, you will have strong quantitative and analytical skills, such as those acquired through a degree in mathematics, statistics, physics, computer science or a related field. You have the curiosity and motivation to work on interdisciplinary projects, which include the generation of new data and their analysis, and are eager to get to grips with relevant areas of biology and the technologies used in biology research. You will have experience in scientific computing and be familiar with one or several programming languages. Familiarity with R is definitely a plus.

For track 2, you will have training in the life sciences and strong coding skills that enable you to undertake complex data transformations, integrative operations, applications of mathematical models and visualizations. You are driven to make fundamental discoveries by mining cutting-edge, large data sets.

To apply, please contact Wolfgang with your CV, a brief statement of research interests, and examples of your work: besides your publications, this can include theses, research reports, talk slides, software projects (e.g. R packages, github projects) or data analysis reports (e.g. markdown reports or Jupyter notebooks).

Here are some keywords and a non-exhaustive list of collaboration partners with whom we work frequently on new, exciting data types:

  • Latent spaces and manifolds estimation from multi-modal single cell data
  • Genotype-drug interactions, precision oncology, multivariate biomarker discovery
  • Imaging-based phenotyping
  • Bioconductor
  • Thorsten Zenz – pharmacogenomics of drug response in blood cancer
  • Sascha Dietrich – systems medicine of cancer drugs
  • Lars Steinmetz – systems genetics & ‘omics technology development
  • Michael Boutros – high-throughput genetics, genetic interactions & synthetic lethality in cancer
  • Henrik Kaessmann – evolution of cell types

 

New paper: Drug-perturbation-based stratification of blood cancer

Abstract: As new generations of targeted therapies emerge and tumor genome sequencing discovers increasingly comprehensive mutation repertoires, the functional relationships of mutations to tumor phenotypes remain largely unknown. Here, we measured ex vivo sensitivity of 246 blood cancers to 63 drugs alongside genome, transcriptome, and DNA methylome analysis to understand determinants of drug response. We assembled a primary blood cancer cell encyclopedia data set that revealed disease-specific sensitivities for each cancer. Within chronic lymphocytic leukemia (CLL), responses to 62% of drugs were associated with 2 or more mutations, and linked the B cell receptor (BCR) pathway to trisomy 12, an important driver of CLL. Based on drug responses, the disease could be organized into phenotypic subgroups characterized by exploitable dependencies on BCR, mTOR, or MEK signaling and associated with mutations, gene expression, and DNA methylation. Fourteen percent of CLLs were driven by mTOR signaling in a non–BCR-dependent manner. Multivariate modeling revealed immunoglobulin heavy chain variable gene (IGHV) mutation status and trisomy 12 as the most important modulators of response to kinase inhibitors in CLL. Ex vivo drug responses were associated with outcome. This study overcomes the perception that most mutations do not influence drug response of cancer, and points to an updated approach to understanding tumor biology, with implications for biomarker discovery and cancer care.


Welcome Holly Giles

Holly graduated from the University of Cambridge with a degree in Natural Sciences. For her thesis, she investigated the within-host diversity of influenza infections using statistical methods. In 2016, she undertook an internship at the Francis Crick Institute, London, during which she investigated potential new methods of influenza surveillance and vaccine research.

Holly joined the Huber group in September 2017 for a PhD, where her work focuses on using multi-omic data to understand drug responses in leukaemia patients. She is working jointly with the Dietrich group at the National Centre for Tumour Diseases, Heidelberg, performing experiments and gaining clinical insight to support her bioinformatic analysis.

Welcome Cécile Le Sueur

Cécile is a master's student in Computational Science at EPFL (Switzerland). She joined the Huber group for an internship in August 2017 and will work with Dorothee and Arne on proteomics data to study drug (off-)targets and systems-wide effects of drug treatment.

The Huber Group on the 6th NCT Run

The race event of the National Center for Tumor Diseases (NCT) Heidelberg took place for the 6th time last Friday, 7 July 2017. More than 4,500 participants, including patients and their families, physicians, scientists and friends of the NCT, joined the event this year to “oppose cancer with a positive note”. An EMBL team supports this initiative every year.
See more (in German)

NCT Run - Huber Group

From the left to the right: Simone Bell, Olena Yavorska, Vladislav Kim, Frederik Ziebell, Almut Lütge and Britta Velten from the Huber Group.

EMBL Charity Cycle Challenge

On 30 June 2017, a group of 14 EMBL cyclists set off on an epic five-day ride from Heidelberg to EMBL Grenoble. Their goal is to raise money for Kinderplanet, a charity that supports the families of children treated at the Heidelberg University Hospital. As the kindergarten relies solely on donations in order to operate, their mission is to help Kinderplanet in supporting families of sick children.

30 June – 4 July 2017
5 days – 850 km distance – 16,500 m ascent
By completing this physically and mentally demanding cycling challenge, they hope to spread the word about the Kinderplanet charity and encourage our colleagues, friends and family members to donate to a great cause.

Almut Lütge and Mike Smith from the Huber Group are taking part in this epic ride!

CSAMA 2017 – Statistical Data Analysis for Genome Scale Biology

CSAMA 2017 (15th edition)
Statistical Data Analysis for Genome Scale Biology
Bressanone-Brixen, Italy (South Tyrol Alps)
June 11-16, 2017

Lecturers:

  • Jennifer Bryan, RStudio and UBC
  • Vincent J. Carey, Harvard Medical School
  • Laurent Gatto, University of Cambridge
  • Wolfgang Huber, European Molecular Biology Laboratory (EMBL), Heidelberg
  • Martin Morgan, Roswell Park Cancer Institute, Buffalo
  • Johannes Rainer, European Academy of Bozen (EURAC)
  • Charlotte Soneson, University of Zurich
  • Levi Waldron, CUNY School of Public Health at Hunter College, New York

Teaching Assistants:

  • Simone Bell, EMBL, Heidelberg
  • Vladislav Kim, EMBL, Heidelberg
  • Lori Shepherd, RPCI, Buffalo
  • Mike L. Smith, EMBL, Heidelberg

The one-week intensive course Statistical Data Analysis for Genome-Scale Biology teaches statistical and computational analysis of multi-omics studies in biology and biomedicine. It covers the underlying theory and state of the art (the morning lectures) and practical hands-on exercises based on the R / Bioconductor environment (the afternoon labs). At the end of the course, you should be able to run analysis workflows on your own (multi-)omic data, adapt and combine different tools, and make informed and scientifically sound choices about analysis strategies.

Topics include:

  • Introduction to R and Bioconductor
  • The elements of statistics: hypothesis testing, multiple testing, regression, regularization, clustering and classification, parallelization and performance (machine learning), visualisation
  • RNA-Seq data analysis
  • Computing with sequences and genomic intervals
  • Working with annotation – genes, genomic features, variants, transcripts and proteins
  • Gene set enrichment analysis
  • Mass spec proteomics and metabolomics
  • Basics of microbiome analysis
  • Experimental design, batch effects and confounding
  • Reproducible research and workflow authoring with R markdown
  • Package development, version control and developer tools (incl. git, github, RStudio)
  • Working with large data: performance, parallelisation and cloud computing

The course consists of

  • morning lectures: 20 x 45 minutes: Monday to Friday 8:30h – 12:00h
  • 4 practical computer tutorials in the afternoons (13:30h – 16:30h) on Monday, Tuesday, Thursday and Friday

Visit the course’s website at: http://www.huber.embl.de/csama

Welcome Frederik Ziebell

Frederik has a PhD in Applied Mathematics from Heidelberg University and the German Cancer Research Center (DKFZ). He works on developing statistical methods for high-dimensional heterogeneous data and the analysis of multi-omic level drug treatment effects in collaboration with Cellzome.

Welcome Almut Lütge

Almut is a master's student in molecular biotechnology at the University of Heidelberg. She joined the Huber group in 2017 and works on the analysis of RNA-seq data from chronic lymphocytic leukemia (CLL) samples.