Last updated: 2024-05-17
Knit directory: SpinalCord_proteomics/analysis/
This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Tracking code development and connecting the code version to the
results is critical for reproducibility. To start using Git, open the
Terminal and type git init in your project directory.
This project is not being versioned with Git. To obtain the full
reproducibility benefits of using workflowr, please see
?wflow_start.
Section 1: Preprocessing and Quality Control of the proteomics data
This analysis describes the pre-processing and quality control steps for
the CSF proteomic data. Overall, the quality of this dataset is good, but
it contains some potential technical noise (unwanted variation). This
unwanted variation can be reduced with statistical methods.
Section 2: Investigate the difference between healthy samples and samples with injury (at baseline)
In this analysis, samples with injury (at visit 3) are compared with
healthy samples to identify proteins that are differentially expressed
between healthy individuals and patients with injury. This analysis can
be considered a biological quality control (QC) of the proteomic data:
known markers of spinal cord injury should be identified in the CSF
samples.
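The comparison described above can be sketched in base R as a per-protein test followed by multiple-testing correction. This is only an illustrative sketch on simulated data: the matrix `expr`, the `group` factor, the spiked-in effect, and the choice of a Welch t-test with Benjamini-Hochberg adjustment are all assumptions, not necessarily the method used in the actual analysis.

```r
# Hypothetical toy data: rows are proteins, columns are samples.
# This is NOT the real CSF dataset.
set.seed(1)
n_proteins <- 50
group <- factor(rep(c("healthy", "injury"), each = 10),
                levels = c("healthy", "injury"))
expr <- matrix(rnorm(n_proteins * length(group)), nrow = n_proteins,
               dimnames = list(paste0("protein_", seq_len(n_proteins)), NULL))
# Spike an artificial shift into the first five proteins of the injury samples
expr[1:5, group == "injury"] <- expr[1:5, group == "injury"] + 3

# Welch t-test per protein; the difference is injury minus healthy
test_protein <- function(x) {
  tt <- t.test(x[group == "injury"], x[group == "healthy"])
  c(logFC = unname(tt$estimate[1] - tt$estimate[2]), p_value = tt$p.value)
}
de_table <- as.data.frame(t(apply(expr, 1, test_protein)))

# Benjamini-Hochberg correction across all proteins
de_table$adj_p <- p.adjust(de_table$p_value, method = "BH")
de_table <- de_table[order(de_table$p_value), ]
head(de_table)
```

In a QC setting, one would then check whether known injury markers appear near the top of such a table.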
Section 3: Identify proteins associated with random node group of baseline samples (Visit 3)
This analysis aims to identify proteins that are differentially
expressed between node group B (random node 10, 16, 17, 18) and node
group A (4, 5, 8, 9, 13) in baseline samples. This analysis may give
insight into how the two groups differ at the CSF proteomic level.
Section 4: Time-series analysis on the treated group
This analysis aims to identify proteins whose expression changes over
time. It also tries to identify proteins whose expression over time
shows different patterns in samples with better recovery (high
delta_UEMS) compared to worse recovery (low delta_UEMS), or in random
node group B versus group A.
Section 5: Time-series analysis on the placebo group
The same as section 4, but on the placebo group.
Section 6: Time-series analysis comparing treated versus placebo group
This analysis aims to identify proteins whose expression changes over
time follow a different pattern in the treated versus the control group,
either in all samples or in samples from random node B. The results can
help identify treatment-specific effects. However, it seems that only
the drug NG101 itself is clearly different. This may be reasonable, as
the direct treatment effect is largely masked by the effect of recovery
over time, which can also be observed in the untreated group.
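One simple way to express "a different pattern over time in treated versus placebo" is a time-by-treatment interaction term. The sketch below fits this for a single simulated protein with base R. The data frame `design`, the visit numbers, and the plain linear model are illustrative assumptions; this version also ignores the repeated measurements within each patient, which a mixed-effects model would handle more appropriately, and it is not necessarily the approach used in the analysis itself.

```r
# Hypothetical long-format data for one protein (NOT the real dataset).
# Visit numbers are placeholders for the sampled time points.
set.seed(3)
design <- expand.grid(patient = 1:20, visit = c(3, 5, 8))
design$treatment <- factor(ifelse(design$patient <= 10, "placebo", "treated"),
                           levels = c("placebo", "treated"))
design$expr <- rnorm(nrow(design)) + 0.1 * design$visit

# Null model: both groups share one time trend.
# Full model: the time trend is allowed to differ by treatment.
fit_null <- lm(expr ~ visit + treatment, data = design)
fit_full <- lm(expr ~ visit * treatment, data = design)

# F-test for the visit:treatment interaction
interaction_test <- anova(fit_null, fit_full)
interaction_p <- interaction_test[["Pr(>F)"]][2]
interaction_p
```

Repeating such a test for every protein and adjusting the p-values would yield a ranked list of candidates with treatment-specific time courses.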
Section 7: Identify proteins that are associated with outcome, delta_UEMS
This analysis aims to identify proteins whose expression at individual
time points, or whose change between time points (expression at visit 8
minus expression at visit 3), is associated with the outcome
(delta_UEMS) in either the placebo or the treated group. This analysis
can identify candidates for predictive multivariate machine learning
models. Comparing the treated and untreated groups (at the end of
section 7) can also give insight into potential drug-specific effects.
Section 8: Identify proteins whose expression changes between visit 3 and visit 8 are directly associated with the UEMS change between visit 3 and visit 8
In this analysis, the protein expression changes between visit 8 and
visit 3 are correlated with the UEMS changes between visit 8 and visit
3. This analysis may help identify proteins that are directly related to
recovery in either the placebo or the treated group.
Section 9: Build machine learning models for predicting treatment outcome, delta_UEMS
In this analysis, different machine learning models are built to predict
the outcome, delta_UEMS, from CSF proteomics. The proteomic models are
also compared with clinical parameters to show that a proteomic model
can predict outcome better than current clinical parameters, or can add
information beyond them.
Section 10: Build machine learning models for predicting treatment outcome, delta_UEMS, only in random node B
The same analysis as section 9, but only for treated patients from node
group B.