Thomas Naake and Wolfgang Huber
First-line data quality assessment and exploratory data analysis are integral parts of any data analysis workflow. In high-throughput quantitative omics experiments (e.g. transcriptomics, proteomics and metabolomics), after initial processing, the data are typically presented as a matrix of numbers (feature IDs × samples). Efficient and standardized calculation and visualization of data quality metrics are key to tracking the within-experiment quality of these rectangular data types and to ensuring high-quality datasets and sound subsequent biological question-driven inference.
We present MatrixQCvis, which provides interactive visualization of data quality metrics at the per-sample and per-feature level using R’s shiny framework. It provides efficient and standardized ways to analyze data quality of quantitative omics data types that come in a matrix-like format (feature IDs × samples). MatrixQCvis builds upon the Bioconductor SummarizedExperiment S4 class and thus facilitates the integration into existing workflows.
MatrixQCvis is implemented in R. It is available via Bioconductor and released under the GPL v3.0 license.
Dorothee Childs, Karsten Bach, Holger Franken, Simon Anders, Nils Kurzawa, Marcus Bantscheff, Mikhail Savitski and Wolfgang Huber
Detecting the targets of drugs and other molecules in intact cellular contexts is a major objective in drug discovery and in biology more broadly. Thermal proteome profiling (TPP) pursues this aim at proteome-wide scale by inferring target engagement from its effects on temperature-dependent protein denaturation. However, a key challenge of TPP is the statistical analysis of the measured melting curves with controlled false discovery rates at high proteome coverage and detection power. We present non-parametric analysis of response curves (NPARC), a statistical method for TPP based on functional data analysis and nonlinear regression. We evaluate NPARC on five independent TPP datasets and observe that it is able to detect subtle changes in any region of the melting curves, reliably detects the known targets, and outperforms a melting point-centric, single-parameter fitting approach in terms of specificity and sensitivity. NPARC can be combined with established analysis of variance (ANOVA) statistics and enables flexible, factorial experimental designs and replication levels. To facilitate access to a wide range of users, a freely available software implementation of NPARC is provided.
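The comparison of a shared melting curve against condition-specific curves can be illustrated with the textbook F-statistic for nested regression models. The following is a minimal sketch, not the NPARC implementation: the function names are hypothetical, the residual sums of squares are assumed to come from curve fits done elsewhere, and the degrees of freedom use the classical parameter-count formula, which the published method may handle differently.

```python
def residual_sum_of_squares(observed, fitted):
    """Sum of squared differences between observed and fitted values."""
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    return sum((o - f) ** 2 for o, f in zip(observed, fitted))


def nested_f_statistic(rss_null, rss_alt, n_params_null, n_params_alt, n_obs):
    """Classical F-statistic for comparing two nested regression models.

    rss_null: residual sum of squares of the null model
              (assumed here: one curve shared by both conditions)
    rss_alt:  residual sum of squares of the alternative model
              (assumed here: one curve per condition)
    The degrees of freedom are the textbook values; this is an assumption
    of the sketch, not a statement about how NPARC computes them.
    """
    df_numerator = n_params_alt - n_params_null
    df_denominator = n_obs - n_params_alt
    if df_numerator <= 0 or df_denominator <= 0:
        raise ValueError("degrees of freedom must be positive")
    if rss_alt <= 0:
        raise ValueError("rss_alt must be positive")
    return ((rss_null - rss_alt) / df_numerator) / (rss_alt / df_denominator)
```

A large statistic indicates that condition-specific curves explain the measurements substantially better than a single shared curve, regardless of which region of the curve differs.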
Arne H. Smits, Frederik Ziebell, Gerard Joberty, …, Lars M. Steinmetz, Gerard Drewes and Wolfgang Huber
Gene knockouts (KOs) are efficiently engineered through CRISPR-Cas9-induced frameshift mutations. While DNA editing efficiency is readily verified by DNA sequencing, a systematic understanding of the efficiency of protein elimination has been lacking. Here, we devised an experimental strategy combining RNA-seq and triple-stage mass spectrometry to characterize 193 genetically verified deletions targeting 136 distinct genes generated by CRISPR-induced frameshifts in HAP1 cells. We observed residual protein expression for about one third of the quantified targets, at variable levels from low to original, and identified two causal mechanisms, translation reinitiation leading to N-terminally truncated target proteins, or skipping of the edited exon leading to protein isoforms with internal sequence deletions. Detailed analysis of three truncated targets, BRD4, DNMT1 and NGLY1, revealed partial preservation of protein function. Our results imply that systematic characterization of residual protein expression or function in CRISPR-Cas9 generated KO lines is necessary for phenotype interpretation.
The genome of pluripotent stem cells adopts a unique three-dimensional architecture featuring weakly condensed heterochromatin and large nucleosome-free regions. Yet, it is unknown whether structural loops and contact domains display characteristics that distinguish embryonic stem cells (ESCs) from differentiated cell types. We used genome-wide chromosome conformation capture and super-resolution imaging to determine nuclear organization in mouse ESC and neural stem cell (NSC) derivatives. We found that loss of pluripotency is accompanied by widespread gain of structural loops. This general architectural change correlates with enhanced binding of CTCF and cohesins and more pronounced insulation of contacts across chromatin boundaries in lineage-committed cells. Reprogramming NSCs to pluripotency restores the unique features of ESC domain topology. Domains defined by the anchors of loops established upon differentiation are enriched for developmental genes. Chromatin loop formation is a pervasive structural alteration to the genome that accompanies exit from pluripotency and delineates the spatial segregation of developmentally regulated genes.
Abstract: As new generations of targeted therapies emerge and tumor genome sequencing discovers increasingly comprehensive mutation repertoires, the functional relationships of mutations to tumor phenotypes remain largely unknown. Here, we measured ex vivo sensitivity of 246 blood cancers to 63 drugs alongside genome, transcriptome, and DNA methylome analysis to understand determinants of drug response. We assembled a primary blood cancer cell encyclopedia data set that revealed disease-specific sensitivities for each cancer. Within chronic lymphocytic leukemia (CLL), responses to 62% of drugs were associated with 2 or more mutations, and we linked the B cell receptor (BCR) pathway to trisomy 12, an important driver of CLL. Based on drug responses, the disease could be organized into phenotypic subgroups characterized by exploitable dependencies on BCR, mTOR, or MEK signaling and associated with mutations, gene expression, and DNA methylation. Fourteen percent of CLLs were driven by mTOR signaling in a non–BCR-dependent manner. Multivariate modeling revealed immunoglobulin heavy chain variable gene (IGHV) mutation status and trisomy 12 as the most important modulators of response to kinase inhibitors in CLL. Ex vivo drug responses were associated with outcome. This study overcomes the perception that most mutations do not influence drug response of cancer, and points to an updated approach to understanding tumor biology, with implications for biomarker discovery and cancer care.
Abstract: Hypothesis weighting improves the power of large-scale multiple testing. We describe independent hypothesis weighting (IHW), a method that assigns weights using covariates independent of the P-values under the null hypothesis but informative of each test’s power or prior probability of the null hypothesis: www.bioconductor.org/packages/IHW. IHW increases power while controlling the false discovery rate and is a practical approach to discovering associations in genomics, high-throughput biology and other large data sets.
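To make the idea of hypothesis weighting concrete, here is a minimal sketch of the weighted Benjamini-Hochberg step-up procedure with weights supplied by the caller. It does not reproduce IHW itself: how weights are derived from covariates is not shown, and the function name and the default `alpha` are assumptions of this example.

```python
def weighted_bh(pvalues, weights, alpha=0.1):
    """Weighted Benjamini-Hochberg; returns one rejection flag per hypothesis.

    Weights must be non-negative. They are rescaled to average 1, each
    p-value is divided by its weight, and the usual step-up rule is applied
    to the weighted p-values. A hypothesis with weight 0 is never rejected.
    """
    m = len(pvalues)
    if m == 0:
        return []
    if len(weights) != m:
        raise ValueError("pvalues and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    mean_weight = sum(weights) / m
    if mean_weight <= 0:
        raise ValueError("at least one weight must be positive")

    normalised = [w / mean_weight for w in weights]
    weighted_p = [
        min(p / w, 1.0) if w > 0 else 1.0
        for p, w in zip(pvalues, normalised)
    ]

    # Step-up rule: find the largest rank k with q_(k) <= alpha * k / m.
    order = sorted(range(m), key=lambda i: weighted_p[i])
    n_rejected = 0
    for rank, i in enumerate(order, start=1):
        if weighted_p[i] <= alpha * rank / m:
            n_rejected = rank

    rejected = [False] * m
    for i in order[:n_rejected]:
        rejected[i] = True
    return rejected
```

With equal weights the function reduces to the ordinary Benjamini-Hochberg procedure. Shifting weight toward hypotheses that are more likely to be non-null can yield additional rejections at the same nominal level, provided the weights do not depend on the p-values under the null hypothesis.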
Abstract: We extended thermal proteome profiling to detect transmembrane protein–small molecule interactions in cultured human cells. When we assessed the effects of detergents on ATP-binding profiles, we observed shifts in denaturation temperature for ATP-binding transmembrane proteins. We also observed cellular thermal shifts in pervanadate-induced T cell–receptor signaling, delineating the membrane target CD45 and components of the downstream pathway, and with drugs affecting the transmembrane transporters ATP1A1 and MDR1.
Abstract: Studies on signalling dynamics in living embryos have been limited by a scarcity of in vivo reporters. Tandem fluorescent protein timers provide a generic method for detecting changes in protein population age and thus provide readouts for signalling events that lead to changes in protein stability or location. When imaged with quantitative dual-colour fluorescence microscopy, tandem timers offer detailed ‘snapshot’ readouts of signalling activity from subcellular to organismal scales, and therefore have the potential to revolutionize studies in developing embryos. Here we use computer modelling and embryo experiments to explore the behaviour of tandem timers in developing systems. We present a mathematical model of timer kinetics and provide software tools that will allow experimentalists to select the most appropriate timer designs for their biological question, and guide interpretation of the obtained readouts. Through the generation of a series of novel zebrafish reporter lines, we confirm experimentally that our quantitative model can accurately predict different timer responses in developing embryos and explain some less expected findings. For example, increasing the FRET efficiency of a tandem timer actually increases the ability of the timer to detect differences in protein half-life. Finally, while previous studies have used timers to monitor changes in protein turnover, our model shows that timers can also be used to facilitate the monitoring of gene expression kinetics in vivo.
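The dependence of a timer readout on protein half-life can be illustrated with a deliberately simplified steady-state calculation. This is a sketch under stated assumptions, not the model from the study: it assumes constant production, first-order degradation, one-step maturation of each fluorophore, and no FRET. The function names and the default maturation half-times (in minutes) are hypothetical.

```python
import math


def rate_from_half_time(half_time):
    """Convert a half-time into a first-order rate constant."""
    if half_time <= 0:
        raise ValueError("half_time must be positive")
    return math.log(2) / half_time


def mature_fraction(maturation_half_time, protein_half_life):
    """Steady-state fraction of molecules whose fluorophore has matured.

    With maturation rate m and degradation rate k, the immature pool is
    lost at rate m + k, so the mature fraction at steady state is m / (m + k).
    """
    m = rate_from_half_time(maturation_half_time)
    k = rate_from_half_time(protein_half_life)
    return m / (m + k)


def timer_ratio(protein_half_life, fast_maturation=5.0, slow_maturation=100.0):
    """Slow-to-fast fluorophore intensity ratio at steady state.

    Both fluorophores sit on the same protein, so production and
    degradation cancel except through the two maturation fractions.
    The default maturation half-times are illustrative values only.
    """
    slow = mature_fraction(slow_maturation, protein_half_life)
    fast = mature_fraction(fast_maturation, protein_half_life)
    return slow / fast
```

In this simplified picture the ratio rises toward 1 as the protein becomes more stable and falls toward the ratio of the two maturation rates as the protein becomes short-lived, which is why the ratio can serve as a readout of protein population age.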
Summary: Morphogenesis of multicellular organisms is driven by localized cell shape changes. How, and to what extent, changes in behavior in single cells or groups of cells influence neighboring cells and large-scale tissue remodeling remains an open question. Indeed, our understanding of multicellular dynamics is limited by the lack of methods allowing the modulation of cell behavior with high spatiotemporal precision. Here, we developed an optogenetic approach to achieve local modulation of cell contractility and used it to control morphogenetic movements during Drosophila embryogenesis. We show that local inhibition of apical constriction is sufficient to cause a global arrest of mesoderm invagination. By varying the spatial pattern of inhibition during invagination, we further demonstrate that coordinated contractile behavior responds to local tissue geometrical constraints. Together, these results show the efficacy of this optogenetic approach to dissect the interplay between cell-cell interaction, force transmission, and tissue geometry during complex morphogenetic processes.
Our first Bioconductor workflow as a (to be) peer-reviewed article is out on F1000 Research! Download here
In addition to HTML and PDF, the workflow is now also rendered in Jupyter notebook format. Watch the video prepared by Vladislav Kim here.