Registration open: CSAMA 2015 (13th ed.) Statistics and Computing in Genome Data Science

CSAMA 2015 (13th edition)
Statistics and Computing in Genome Data Science
Bressanone-Brixen, Italy (South Tyrol Alps)
June 14-19, 2015

Registration for CSAMA 2015 is now open

Lecturers:

  • Martin Morgan, Fred Hutchinson Cancer Research Center (USA)
  • Wolfgang Huber, European Molecular Biology Laboratory
  • Vincent J. Carey, Channing Laboratory, Harvard Medical School (USA)
  • Michael Love, Dana Farber Cancer Institute and the Harvard School of Public Health (USA)
  • Simon Anders, European Molecular Biology Laboratory
  • Mark Robinson, University of Zurich (CH)
  • Laurent Gatto, University of Cambridge (UK)
  • Paul Pyl, University of Copenhagen (DK)

The one-week intensive course “Statistics and Computing in Genome Data Science” teaches statistical and computational analysis of multi-omics studies in biology and biomedicine. It covers the underlying theory and the state of the art (in the morning lectures) and offers practical hands-on exercises based on the R / Bioconductor environment (in the afternoon labs). The course covers the primary analysis (“preprocessing”) of high-throughput sequencing-based assays in functional genomics (transcriptomics, epigenetics, etc.) as well as integrative methods, including efficient operations on genomic intervals, statistical testing, linear models, machine learning, bioinformatic annotation and visualization. At the end of the course, you should be able to run analysis workflows on your own (multi-)omic data, adapt and combine different tools, and make informed and scientifically sound choices about analysis strategies.

Topics include:

  • Introduction to Bioconductor
  • Elements of statistics: hypothesis testing, multiple testing, regression, regularization, clustering and classification (machine learning), visualization
  • Computing with sequences and genomic intervals
  • RNA-Seq data analysis and differential expression
  • ChIP-Seq and epigenetics
  • Integrating DNA variant calls with functional data, and large-scale efficient computation with genomic intervals
  • Working with annotation – genes, genomic features and variants
  • Metagenomics and proteomics primers
  • Single-cell RNA-Seq primer
  • Interactive displays with Shiny

The course consists of

  • morning lectures: 20 x 45 minutes, Monday to Friday, 8:30am – 12:00pm
  • 4 practical computer tutorials in the afternoons (2pm – 5pm) on Monday, Tuesday, Thursday and Friday

For registration visit our website at:

Bioconductor project – Perspective paper

The Perspective paper Orchestrating high-throughput genomic analysis with Bioconductor is addressed to users and prospective developers. It gives an overview of the collaborative software development and delivery model of the Bioconductor project. At Readcube:

Abstract: Bioconductor is an open-source, open-development software project for the analysis and comprehension of high-throughput data in genomics and molecular biology. The project aims to enable interdisciplinary research, collaboration and rapid development of scientific software. Based on the statistical programming language R, Bioconductor comprises 934 interoperable packages contributed by a large, diverse community of scientists. Packages cover a range of bioinformatic and statistical applications. They undergo formal initial review and continuous automated testing. We present an overview for prospective users and contributors.

DESeq2 paper published

We are happy to announce our recent paper by Michael I Love, Wolfgang Huber and Simon Anders: Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2, Genome Biology, 15:550 (2014).

Abstract: In comparative high-throughput sequencing assays, a fundamental task is the analysis of count data, such as read counts per gene in RNA-seq, for evidence of systematic changes across experimental conditions. Small replicate numbers, discreteness, large dynamic range and the presence of outliers require a suitable statistical approach. We present DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression. The DESeq2 package is available at Bioconductor.
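
For readers who want to see what such an analysis looks like in practice, here is a minimal sketch in R, assuming the DESeq2 package is installed from Bioconductor. It runs on simulated counts from the package's makeExampleDESeqDataSet() rather than real data; the number of genes, the number of samples and the seed are arbitrary choices for illustration. The final lfcShrink() call assumes a recent DESeq2 release, in which shrunken fold changes are computed in a separate step.

```r
# Minimal DESeq2 sketch on simulated data (illustrative settings only).
library(DESeq2)

set.seed(1)  # arbitrary seed, for reproducibility of the simulation

# Simulated count matrix: 1000 genes, 6 samples in two conditions (A, B).
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)

# Size factors, dispersion estimation (with shrinkage) and Wald tests.
dds <- DESeq(dds)

# Results table for the condition effect, one row per gene.
res <- results(dds)
head(res[order(res$padj), ])

# Assumption: recent DESeq2 versions expose fold-change shrinkage
# through lfcShrink(); coef = 2 is the condition coefficient here.
resShrunk <- lfcShrink(dds, coef = 2, type = "normal")
```

The columns log2FoldChange and padj in the results table hold the estimated effect size and the adjusted p-value for each gene.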

Advanced Course: R programming and development

On 15-16 January 2015 we are hosting the Advanced Course: R programming and development.

The course will focus on two aspects of R programming and development.
The first part will introduce object-oriented programming using R’s S3 and S4 systems and describe how to define classes, generic functions and methods. It will also present how to create Bioconductor-compliant R packages and document them. The second part will focus on various advanced topics in R programming such as unit testing, debugging, profiling and calling C/C++ code. It will also describe how to write efficient and elegant code using vectorization, parallelization, and the functional programming paradigm. Finally, creating web applications with shiny will be discussed.

These topics will be illustrated using a small real-life bioinformatic case study. By the end of the course, participants will have produced a fully fledged, Bioconductor-compliant R package.
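
As a taste of the object-oriented part, the sketch below defines a class, a generic function and a method, first in S4 and then in S3. It is not taken from the course material: the class names GeneSet and geneSetS3 and the generic nGenes() are hypothetical names chosen for illustration. Only base R and the methods package are used.

```r
library(methods)

# --- S4: formal class with typed slots (hypothetical example class) ---
setClass("GeneSet", slots = c(name = "character", genes = "character"))

# Validity check, run automatically by new().
setValidity("GeneSet", function(object) {
  if (length(object@name) != 1L) "name must be a single string" else TRUE
})

# A generic function and its method for GeneSet objects.
setGeneric("nGenes", function(x, ...) standardGeneric("nGenes"))
setMethod("nGenes", "GeneSet", function(x, ...) length(x@genes))

# Custom display, used when the object is printed at the console.
setMethod("show", "GeneSet", function(object) {
  cat("GeneSet", object@name, "with", nGenes(object), "genes\n")
})

gs <- new("GeneSet", name = "demo", genes = c("TP53", "BRCA1", "MYC"))
gs

# --- S3: informal class, dispatch on the class attribute ---
gs3 <- structure(list(name = "demo", genes = c("TP53", "BRCA1", "MYC")),
                 class = "geneSetS3")

print.geneSetS3 <- function(x, ...) {
  cat("geneSetS3", x$name, "with", length(x$genes), "genes\n")
  invisible(x)
}

print(gs3)
```

S4 checks slot types and validity when the object is created, whereas S3 relies on convention; this difference is one reason Bioconductor packages commonly use S4 for their core data structures.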

European Bioconductor Developers’ Meeting

12th-13th January 2015, EMBL Heidelberg

We are proudly hosting this year’s European Bioconductor Developers’ Meeting.

The meeting is aimed at bioinformaticians, programmers and software engineers who contribute to the Bioconductor project, or are interested in developing packages for Bioconductor. The goals are to:

  • foster the exchange of technical expertise,
  • keep contributors up to date with the latest developments,
  • coordinate any related efforts.

For details see the event’s website.