Load the mice data using the following commands:
mice_pheno <- read.csv2(file = url("https://raw.githubusercontent.com/genomicsclass/dagdata/master/inst/extdata/mice_pheno.csv"), sep = ",")
mice_pheno$Bodyweight <- as.numeric(mice_pheno$Bodyweight)
Consider the ELISA exercise from day 2 (example from MSMB).
Suppose now we have a known false positive rate of 1%. This is the probability of declaring a hit – we think we have an epitope – when there is none.
Can you think of a test that answers the question: What is the probability of seeing counts as large as 7 (7 out of the 50 patients had a hit), when there really is no epitope at this position?
What is problematic about reporting the obtained p-value for the epitope detected at position 42? Discuss in your breakout room.
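One possible way to approach the tail-probability question is sketched below. It assumes that each of the 50 patients independently produces a false hit with probability 1%, so that the number of hits at a position follows a binomial distribution when there is no epitope. The variable names (`n_patients`, `n_hits`, `fpr`) are illustrative and not part of the original exercise.

```r
# Numbers taken from the exercise text: 50 patients, 7 hits,
# and a false positive rate of 1% when there is no epitope.
n_patients <- 50
n_hits <- 7
fpr <- 0.01

# P(X >= 7) for X ~ Binomial(50, 0.01).
# pbinom(q, lower.tail = FALSE) returns P(X > q), hence n_hits - 1.
p_tail <- pbinom(n_hits - 1, size = n_patients, prob = fpr, lower.tail = FALSE)

# The same one-sided probability from the exact binomial test.
p_test <- binom.test(n_hits, n_patients, p = fpr, alternative = "greater")$p.value

p_tail
p_test
```

Both lines should print the same value, because the one-sided p-value of `binom.test` is computed from the same binomial tail.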
This exercise is adapted from Bernd Klaus’ teaching materials.
The ALL data consist of microarrays from 128 different individuals with acute lymphoblastic leukemia (ALL). There are 95 samples with B-cell ALL and 33 with T-cell ALL. Because these are different tissues and quite different diseases, we consider them separately and focus on the B-cell ALL tumors. An interesting subset, with two groups of approximately equal size, is the comparison of the B-cell tumors found to carry the BCR/ABL mutation to those B-cell tumors with no observed cytogenetic abnormalities. These samples are labeled BCR/ABL and NEG in the mol.biol variable. The BCR/ABL mutation, also known as the Philadelphia chromosome, was the first cytogenetic aberration that could be associated with the development of cancer, leading the way to the current understanding of the disease. In tumors harboring the BCR/ABL translocation, a short piece of chromosome 22 is exchanged with a segment of chromosome 9. As a consequence, a constitutively active fusion protein is expressed, which acts as a potent mitogen and leads to uncontrolled cell division. Not all leukemia tumors carry the Philadelphia chromosome; other mutations can also be responsible for neoplastic alterations of blood cells, for instance a translocation between chromosomes 4 and 11 (ALL1/AF4).
In this exercise we want to test whether the expression of the BCL2 gene differs between B-cell samples with and without the BCR/ABL fusion.
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("ALL")
The following code loads the data and subsets them:
library(ALL)
data(ALL)

# Subset to B-cells
my_ALL <- ALL[, substr(ALL$BT, 1, 1) == "B"]

# Only consider BCR/ABL and NEG cases
my_ALL <- my_ALL[, my_ALL$mol.biol %in% c("BCR/ABL", "NEG")]

# Turn molecular biology into a factor
my_ALL$mol.biol <- as.factor(my_ALL$mol.biol)

# These are the expression values:
expr_data <- exprs(my_ALL)

# Extract expression values of the BCL2 gene (row 1152) for the two groups:
neg <- expr_data[1152, my_ALL$mol.biol == "NEG"]
bcrabl <- expr_data[1152, my_ALL$mol.biol == "BCR/ABL"]
The two vectors neg and bcrabl contain the BCL2 expression values of the negative and BCR/ABL tumors, respectively.
By default, t.test performs a Welch test. What does that mean, and should you change this default option for this particular example?
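To see what the default changes in practice, the following minimal sketch compares the Welch test with the classical pooled-variance t-test. It uses simulated data rather than the ALL expression values; the vectors `grp_a` and `grp_b`, their sizes, and their means and standard deviations are made up for illustration.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Hypothetical stand-ins for two groups of expression values
# (simulated with different variances; NOT the ALL data).
grp_a <- rnorm(40, mean = 0, sd = 1)
grp_b <- rnorm(35, mean = 0.5, sd = 2)

# Default behaviour: Welch test, which does not assume equal variances
# and uses approximate (usually non-integer) degrees of freedom.
welch <- t.test(grp_a, grp_b)

# Classical two-sample t-test with a pooled variance estimate.
pooled <- t.test(grp_a, grp_b, var.equal = TRUE)

# Compare degrees of freedom and p-values of the two variants.
c(welch = unname(welch$parameter), pooled = unname(pooled$parameter))
c(welch = welch$p.value, pooled = pooled$p.value)
```

The same two calls can be applied to neg and bcrabl once the ALL data are loaded; whether the two variants differ noticeably there depends on how similar the group variances are.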
This example is from a genomics lecture by Rafael Irizarry.
Get an impression of how the t-test and the Wilcoxon test cope with outliers:
Create two vectors x and y with 25 standard-normally distributed data points each.
Do x and y have different means? Confirm with a t-test and Wilcoxon test.
Introduce two outliers into x with the values 5 and 7.
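The steps of this exercise could be set up roughly as follows. This is only a sketch: the seed is arbitrary, and overwriting the first two entries of x is an assumption about how the outliers are introduced. The name `x_out` is chosen here for illustration.

```r
set.seed(42)  # arbitrary seed, for reproducibility only

# Two samples of 25 standard-normal values each.
x <- rnorm(25)
y <- rnorm(25)

# Compare the two samples without outliers.
p_t <- t.test(x, y)$p.value
p_w <- wilcox.test(x, y)$p.value

# Assumption: the outliers overwrite the first two entries of x.
x_out <- x
x_out[1:2] <- c(5, 7)

# Repeat both tests on the contaminated sample.
p_t_out <- t.test(x_out, y)$p.value
p_w_out <- wilcox.test(x_out, y)$p.value

# Collect the four p-values side by side.
rbind(original = c(t_test = p_t, wilcoxon = p_w),
      outliers = c(t_test = p_t_out, wilcoxon = p_w_out))
```

Rerunning the sketch with different seeds gives a feeling for how strongly each test reacts to the two extreme values.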