Last updated: 2024-05-24

Checks: 4 passed, 2 warnings

Knit directory: RA_Tcell_omics/analysis/

This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(20221110) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

The following chunks had caches available:
  • unnamed-chunk-20
  • unnamed-chunk-22
  • unnamed-chunk-24
  • unnamed-chunk-26
  • unnamed-chunk-28
  • unnamed-chunk-30
  • unnamed-chunk-32
  • unnamed-chunk-34
  • unnamed-chunk-36
  • unnamed-chunk-38
  • unnamed-chunk-40
  • unnamed-chunk-42

To ensure reproducibility of the results, delete the cache directory Keto_diet_analysis_cache and re-run the analysis. To have workflowr automatically delete the cache directory prior to building the file, set delete_cache = TRUE when running wflow_build() or wflow_publish().

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Tracking code development and connecting the code version to the results is critical for reproducibility. To start using Git, open the Terminal and type git init in your project directory.


This project is not being versioned with Git. To obtain the full reproducibility benefits of using workflowr, please see ?wflow_start.


Load libraries

Global variables

Load and preprocess datasets

Load omics data

Remove CpGs on the Y chromosome
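The code for this step is not shown in the rendered report. A minimal sketch of such a filter, assuming an annotation table with hypothetical columns `cpg` and `chr` and a methylation matrix with CpGs in rows, could look like this:

```r
# Toy annotation table; the column names `cpg` and `chr` are assumptions.
annot <- data.frame(
  cpg = c("cg01", "cg02", "cg03", "cg04"),
  chr = c("chr1", "chrY", "chrX", "chrY"),
  stringsAsFactors = FALSE
)

# Toy methylation matrix: rows are CpGs, columns are samples.
meth <- matrix(
  seq(0.1, 0.8, length.out = 8),
  nrow = 4,
  dimnames = list(annot$cpg, c("s1", "s2"))
)

# Look up the chromosome of each row and drop rows located on chrY.
keep <- annot$chr[match(rownames(meth), annot$cpg)] != "chrY"
meth_noY <- meth[keep, , drop = FALSE]
rownames(meth_noY)
```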

Filter for CpGs of interest

Restricting the analysis to these CpGs increases statistical power by reducing the number of tests and improves interpretability.
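To illustrate why fewer tests help, the sketch below applies a Bonferroni correction to the same raw p-value under two test counts. The value 850000 is a hypothetical round number standing in for an unfiltered array, 82147 is the number of CpGs this report lists as the final data dimension, and the raw p-value is invented for illustration.

```r
p_raw <- 5e-7          # hypothetical raw p-value for one CpG

n_full <- 850000       # assumption: rough size of an unfiltered array
n_filtered <- 82147    # number of CpGs retained in this report

# Bonferroni multiplies the p-value by the number of tests (capped at 1).
adj_full <- p.adjust(p_raw, method = "bonferroni", n = n_full)
adj_filtered <- p.adjust(p_raw, method = "bonferroni", n = n_filtered)

c(full = adj_full, filtered = adj_filtered)
```

With these numbers the same CpG passes a 0.05 threshold only in the filtered set. The same logic applies, less severely, to false discovery rate corrections.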

CpGs associated with genes of interest

CpGs within known enhancers

Find CpG probes in the enhancer regions

Subset to CpGs in either of the two lists
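Keeping CpGs that appear in either list amounts to taking the union of the two ID vectors. A minimal sketch with invented probe IDs and hypothetical object names:

```r
# Hypothetical ID lists: gene-associated CpGs and enhancer CpGs.
cpgs_gene <- c("cg01", "cg02", "cg03")
cpgs_enhancer <- c("cg03", "cg05")

# A CpG is kept if it is in at least one list.
selected <- union(cpgs_gene, cpgs_enhancer)

# Toy methylation matrix with six CpGs and two samples.
meth <- matrix(
  runif(12),
  nrow = 6,
  dimnames = list(paste0("cg0", 1:6), c("s1", "s2"))
)

meth_sub <- meth[rownames(meth) %in% selected, , drop = FALSE]
dim(meth_sub)
```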

Final data dimensions

[1] 82147    40

Prepare annotation table for CpGs

PCA

Without adjusting for patient-specific effects

Calculate PCA

PCA plots

PC1 versus PC2

PC3 versus PC4

PCA with adjustment for patient-specific effects

Use ComBat to remove patient-specific effects

Calculate PCA

PCA plots

PC1 versus PC2

After adjusting for patient-specific effects, the time point effect is clearly visible along PC1.
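The effect described here can be reproduced on simulated data. The sketch below does not use ComBat; it substitutes a simpler per-patient mean centering as a stand-in, and all data, sizes, and effect magnitudes are invented. It shows PC1 tracking time only after the patient effect is removed.

```r
set.seed(1)
n_cpg <- 200
patient <- factor(rep(paste0("P", 1:5), each = 2))
time <- factor(rep(c("d0", "d12"), times = 5))

# Simulated data: a strong patient effect, a time shift on 100 CpGs, and noise.
patient_effect <- matrix(rnorm(n_cpg * 5, sd = 2), nrow = n_cpg)[, as.integer(patient)]
time_effect <- outer(c(rep(1, 100), rep(0, 100)), as.integer(time == "d12"))
noise <- matrix(rnorm(n_cpg * 10, sd = 0.2), nrow = n_cpg)
meth <- patient_effect + time_effect + noise

# PCA on samples (prcomp expects samples in rows).
pca_raw <- prcomp(t(meth), center = TRUE, scale. = FALSE)

# Stand-in for ComBat: subtract each patient's mean profile from their samples.
meth_adj <- meth
for (p in levels(patient)) {
  idx <- patient == p
  meth_adj[, idx] <- meth[, idx] - rowMeans(meth[, idx, drop = FALSE])
}
pca_adj <- prcomp(t(meth_adj), center = TRUE, scale. = FALSE)

# Absolute correlation of PC1 scores with the time point, before and after.
time_num <- as.integer(time == "d12")
cor_raw <- abs(cor(pca_raw$x[, 1], time_num))
cor_adj <- abs(cor(pca_adj$x[, 1], time_num))
c(raw = cor_raw, adjusted = cor_adj)
```

Per-patient centering only works this simply because every patient contributes both time points. ComBat additionally applies empirical Bayes shrinkage to the batch estimates.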

PC3 versus PC4

Differential methylation analysis to identify methylation change over time

Sample size

, ,  = ns

             
              d0 d12
  Disease_CNT  3   3
  RA           7   7

, ,  = st

             
              d0 d12
  Disease_CNT  3   3
  RA           7   7

Comparisons within RA are 7 versus 7 samples; comparisons within disease controls (CNT) are 3 versus 3. Statistical power may therefore be limited.
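A three-way count table of this shape can be produced with `table()` on the sample metadata. The sketch below rebuilds a metadata table with the same counts. It assumes each of 10 patients contributes all four time-by-stimulation combinations, which matches the 40 samples reported but is not stated in the source, and the column names are hypothetical.

```r
# Hypothetical metadata: 10 patients x 2 time points x 2 stimulation states.
meta <- expand.grid(
  patient = 1:10,
  time = c("d0", "d12"),
  stim = c("ns", "st"),
  stringsAsFactors = FALSE
)
# Assumption for illustration: patients 1-7 are RA, 8-10 are disease controls.
meta$group <- ifelse(meta$patient <= 7, "RA", "Disease_CNT")

# Group x time counts, printed as one slice per stimulation state.
counts <- table(meta$group, meta$time, meta$stim)
counts
```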

In patients with RA

Non-stimulated samples

Subset

Perform differential methylation using limma
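The model specification is not visible in the rendered report. One common way to test d12 versus d0 within the same patients in limma is a paired design with patient as a fixed blocking factor. The sketch below uses simulated values; the object names are hypothetical and the choice of `~ patient + time` is an assumption, not the author's confirmed design. The limma calls are skipped if the package is not installed.

```r
set.seed(1)
n_cpg <- 100
patient <- factor(rep(paste0("P", 1:7), each = 2))
time <- factor(rep(c("d0", "d12"), times = 7), levels = c("d0", "d12"))

# Simulated methylation values (e.g. M-values); the first 5 CpGs shift at d12.
m_values <- matrix(
  rnorm(n_cpg * 14),
  nrow = n_cpg,
  dimnames = list(paste0("cg", seq_len(n_cpg)), paste0(patient, "_", time))
)
m_values[1:5, time == "d12"] <- m_values[1:5, time == "d12"] + 3

# Paired design: one coefficient per patient plus the time effect "timed12".
design <- model.matrix(~ patient + time)

if (requireNamespace("limma", quietly = TRUE)) {
  fit <- limma::eBayes(limma::lmFit(m_values, design))
  res <- limma::topTable(fit, coef = "timed12", number = Inf, sort.by = "none")
  print(head(res[order(res$P.Value), ]))
}
```

With 7 pairs this design leaves 6 residual degrees of freedom per CpG, which is where the moderated variance from `eBayes()` helps most.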

Stimulated samples

Subset

Perform differential methylation using limma

Stimulated versus non-stimulated at day 0

Subset

Perform differential methylation using limma

Stimulated versus non-stimulated at day 12

Subset

Perform differential methylation using limma

Block for time or stimulation or test for their interaction

Subset

Perform differential methylation using limma

In disease control patients

Non-stimulated samples

Subset

Perform differential methylation using limma

Stimulated samples

Subset

Perform differential methylation using limma

Stimulated versus non-stimulated at day 0

Subset

Perform differential methylation using limma

Stimulated versus non-stimulated at day 12

Subset

Perform differential methylation using limma

Block for time or stimulation or test for their interaction

Subset

Perform differential methylation using limma

Compare RA and CNT using a mixed-effects model with interaction
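The interaction term asks whether the d12 versus d0 change differs between RA and CNT. The sketch below shows this for a single simulated CpG using a plain fixed-effects `lm()`, which is a simplification: it omits the random patient effect of a mixed model (in limma this is commonly handled with `duplicateCorrelation()` and a `block` argument; the source does not show which approach was used). All data are invented.

```r
set.seed(2)
group <- factor(
  rep(c("Disease_CNT", "RA"), times = c(6, 14)),
  levels = c("Disease_CNT", "RA")
)
time <- factor(
  c(rep(c("d0", "d12"), 3), rep(c("d0", "d12"), 7)),
  levels = c("d0", "d12")
)

# One simulated CpG: only RA samples shift at d12.
y <- rnorm(20) + ifelse(group == "RA" & time == "d12", 1.5, 0)

fit <- lm(y ~ group * time)
interaction_coef <- unname(coef(fit)["groupRA:timed12"])

# The interaction coefficient equals the difference of the two time effects.
cell_means <- tapply(y, list(group, time), mean)
diff_of_diffs <- (cell_means["RA", "d12"] - cell_means["RA", "d0"]) -
  (cell_means["Disease_CNT", "d12"] - cell_means["Disease_CNT", "d0"])

c(interaction = interaction_coef, diff_of_diffs = unname(diff_of_diffs))
```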

Without stimulation

Subset

Perform differential methylation using limma

With stimulation

Subset

Perform differential methylation using limma

Summarise the comparison results

P-value histogram

If the p-value histogram has a peak on the left (near zero), a real difference is more likely. If the histogram is largely flat, any difference is probably weak or absent.
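The two histogram shapes can be seen on simulated p-values. In the sketch below, the null set is uniform, and the second set mixes uniform p-values with a fraction drawn from a distribution concentrated near zero. The mixing proportion and the Beta parameters are arbitrary choices for illustration.

```r
set.seed(3)

# No real differences: p-values are uniform on (0, 1).
p_null <- runif(10000)

# Some real differences: 20% of p-values are concentrated near zero.
p_signal <- c(runif(8000), rbeta(2000, 0.5, 10))

# Bin counts with a bin width of 0.05, without drawing the plots.
breaks <- seq(0, 1, by = 0.05)
counts_null <- hist(p_null, breaks = breaks, plot = FALSE)$counts
counts_signal <- hist(p_signal, breaks = breaks, plot = FALSE)$counts

# Compare the leftmost bin (p < 0.05) between the two scenarios.
c(null = counts_null[1], signal = counts_signal[1])
```

Dropping `plot = FALSE` draws the histograms themselves.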

Number of significant associations with raw P.Value < 0.01
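When reading these counts, it helps to know how many CpGs would pass a raw threshold by chance alone. A rough calculation, assuming each comparison tests all 82147 CpGs listed as the final data dimension and that all null hypotheses are true:

```r
n_tests <- 82147   # CpGs retained after filtering, per this report
alpha <- 0.01      # raw p-value threshold used in this section

# Expected number of CpGs with p < alpha if there were no true differences.
expected_by_chance <- n_tests * alpha
expected_by_chance
```

Comparisons whose counts are close to this figure are consistent with no signal.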

Compare the overlap of differentially methylated CpGs with Venn diagrams

In RA samples, d12 versus d0, compare with and without stimulation

In control samples, d12 versus d0, compare with and without stimulation

d12 versus d0, without stimulation, compare RA and CNT

d12 versus d0, with stimulation, compare RA and CNT
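The region sizes shown in a two-set Venn diagram can be computed directly from the two lists of significant CpG IDs. A minimal sketch with invented IDs and hypothetical object names:

```r
# Hypothetical significant CpGs from two comparisons.
sig_ns <- c("cg01", "cg02", "cg03", "cg04")
sig_st <- c("cg03", "cg04", "cg05")

# Sizes of the three regions of a two-set Venn diagram.
overlap <- c(
  only_ns = length(setdiff(sig_ns, sig_st)),
  both = length(intersect(sig_ns, sig_st)),
  only_st = length(setdiff(sig_st, sig_ns))
)
overlap
```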

Scatter plot for comparison

In RA samples, d12 versus d0, compare with and without stimulation

In control samples, d12 versus d0, compare with and without stimulation

d12 versus d0, without stimulation, compare RA and CNT

d12 versus d0, with stimulation, compare RA and CNT

Visualize CpGs that show an interaction with disease group

No stimulation

In the above plots, the first line of the title is the methylation CpG ID, the second line is the gene and gene region annotation from the EPIC methylation array, and the third line is the annotation from GeneHancer. The number in parentheses is the enhancer score; a higher score indicates higher confidence.

With stimulation