The data visualization cheat sheet will be very useful for today’s exercises.
Look at the data for eruptions of the Old Faithful geyser:
str(faithful)
## 'data.frame': 272 obs. of 2 variables:
## $ eruptions: num 3.6 1.8 3.33 2.28 4.53 ...
## $ waiting : num 79 54 74 62 85 55 88 85 51 85 ...
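For these data, a scatter plot with a straight-line fit from geom_smooth could be sketched as follows. The choice of axes (eruption duration on x, waiting time on y) and the axis labels are assumptions; the exercise does not fix them.

```r
library(ggplot2)

# faithful ships with base R (package 'datasets'), so no extra download is needed.
# Assumption: eruption duration on the x-axis, waiting time on the y-axis.
p <- ggplot(faithful, aes(x = eruptions, y = waiting)) +
  geom_point() +
  # method = "lm" fits a straight line; the default would be a flexible smoother
  geom_smooth(method = "lm", formula = y ~ x) +
  labs(x = "Eruption duration", y = "Waiting time")

print(p)
```

Leaving out method = "lm" gives a smooth curve instead of a line, which can be a useful comparison.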
Use geom_smooth to fit a straight line through the scatter plot.
Load the penguins data that we have seen in the demo:
library(palmerpenguins)
str(penguins)
## tibble[,8] [344 × 8] (S3: tbl_df/tbl/data.frame)
## $ species : Factor w/ 3 levels "Adelie","Chinstrap",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ island : Factor w/ 3 levels "Biscoe","Dream",..: 3 3 3 3 3 3 3 3 3 3 ...
## $ bill_length_mm : num [1:344] 39.1 39.5 40.3 NA 36.7 39.3 38.9 39.2 34.1 42 ...
## $ bill_depth_mm : num [1:344] 18.7 17.4 18 NA 19.3 20.6 17.8 19.6 18.1 20.2 ...
## $ flipper_length_mm: int [1:344] 181 186 195 NA 193 190 181 195 193 190 ...
## $ body_mass_g : int [1:344] 3750 3800 3250 NA 3450 3650 3625 4675 3475 4250 ...
## $ sex : Factor w/ 2 levels "female","male": 2 1 1 NA 1 2 1 2 NA NA ...
## $ year : int [1:344] 2007 2007 2007 2007 2007 2007 2007 2007 2007 2007 ...
We are going to look at the airway data set. This data set contains read counts per gene for airway smooth muscle cell lines in an RNA-Seq experiment. The aim of this experiment was to compare the treatment with dex (dexamethasone, a drug which is used to treat asthma) to a control (no treatment). Suppose we want to check the quality of our data. We want to test whether the genes show reproducible expression behavior in two replicates of the same condition.
The following code assigns the read counts of all tested genes in two replicates of dex treatment to two vectors, rep1 and rep2:
library(airway)
data("airway")
my_data <- assay(airway)
rep1 <- my_data[,2]
rep2 <- my_data[,4]
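One possible way to check reproducibility is to plot the two replicates against each other and to compute a correlation. The sketch below is only one option: the log-scaled axes, the pseudocount of 1, the diagonal reference line, and the Spearman correlation are choices made here for illustration, not something the exercise prescribes.

```r
library(airway)
library(SummarizedExperiment)  # provides assay()
library(ggplot2)

data("airway")
my_data <- assay(airway)
reps <- data.frame(rep1 = my_data[, 2], rep2 = my_data[, 4])

# Rank-based correlation, so a few very highly expressed genes do not dominate
rho <- cor(reps$rep1, reps$rep2, method = "spearman")

# Counts span several orders of magnitude, so both axes are log-scaled;
# adding 1 keeps genes with zero counts on the plot (an assumption of this sketch)
p <- ggplot(reps, aes(x = rep1 + 1, y = rep2 + 1)) +
  geom_point(alpha = 0.2) +
  geom_abline(intercept = 0, slope = 1, colour = "red") +
  scale_x_log10() +
  scale_y_log10() +
  labs(x = "Replicate 1 (count + 1)", y = "Replicate 2 (count + 1)")

print(p)
rho
```

Genes with reproducible expression should lie close to the red diagonal; points far from it have very different counts in the two replicates.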
Load the following gene expression data:
library(dslabs)
data("tissue_gene_expression")
tissue <- tissue_gene_expression$y
expression <- tissue_gene_expression$x
Look at the tissue and expression objects. What are the rows and columns in both objects? How are the objects connected to each other?
Create a data.frame with two columns: the tissue and expression values for all measurements on the gene “FLI1”.
What does ggbeeswarm::geom_beeswarm do and can it be useful for your plot?
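The FLI1 data.frame and a beeswarm plot could be sketched as follows. This assumes that the rows of expression are in the same order as the entries of tissue and that the genes are the column names of expression; the variable name fli1 and the axis labels are made up for this example.

```r
library(dslabs)
library(ggplot2)
library(ggbeeswarm)

data("tissue_gene_expression")
tissue <- tissue_gene_expression$y
expression <- tissue_gene_expression$x

# Assumption: row i of 'expression' and element i of 'tissue' describe the same sample
fli1 <- data.frame(
  tissue = tissue,
  expression = expression[, "FLI1"]
)

# geom_beeswarm() shifts points sideways so that they do not overlap,
# which shows every single measurement within each tissue
p <- ggplot(fli1, aes(x = tissue, y = expression)) +
  geom_beeswarm() +
  labs(x = "Tissue", y = "FLI1 expression")

print(p)
```

Compared with a box plot, this keeps the individual measurements visible, which helps when some tissues have only a few samples.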