# Load and attach PRONE
library(PRONE)
Here, we work directly with the SummarizedExperiment object. For more information on how to create a SummarizedExperiment from a proteomics data set, please refer to the “Get Started” vignette.
The example TMT data set originates from Biadglegne et al. (2022).
data("tuberculosis_TMT_se")
se <- tuberculosis_TMT_se
This SummarizedExperiment object already contains data from several normalization methods. Since this vignette shows how to apply the PRONE workflow to novel proteomics data, we remove the normalized data and keep only the raw and log2 data, which are the assays available directly after loading a data set.
se <- subset_SE_by_norm(se, ain = c("raw", "log2"))
To get an overview of the number of NAs, use the function get_NA_overview():
knitr::kable(get_NA_overview(se, ain = "log2"))
| Total.Values | NA.Values | NA.Percentage |
|---|---|---|
| 6020 | 1945 | 32.30897 |
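The three columns in this overview can be reproduced by hand, which helps when checking the numbers. The following is a minimal base R sketch on a made-up toy matrix; it is not necessarily how get_NA_overview() is implemented. The names `log2_mat` and `na_overview` are hypothetical. For the real object, the matrix could presumably be obtained with `SummarizedExperiment::assay(se, "log2")`.

```r
# Toy log2 intensity matrix (proteins x samples); values are invented for illustration.
log2_mat <- matrix(
  c(10.1,   NA,  9.8,
      NA,   NA, 11.2,
     8.7,  9.0,  9.3,
    12.4,   NA, 12.0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("P", 1:4), paste0("S", 1:3))
)

total_values <- length(log2_mat)      # number of cells in the matrix
na_values    <- sum(is.na(log2_mat))  # number of missing cells

# Same column layout as the table above
na_overview <- data.frame(
  Total.Values  = total_values,
  NA.Values     = na_values,
  NA.Percentage = na_values / total_values * 100
)
na_overview
```

Applied to the values reported above, the same arithmetic gives 1945 / 6020 * 100 = 32.30897.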
To get an overview of the number of samples per sample group or batch, use the function plot_condition_overview() and specify the metadata column that should be used for coloring. By default (condition = NULL), the column specified in load_data() is used.
plot_condition_overview(se, condition = NULL)
#> Condition of SummarizedExperiment used!
plot_condition_overview(se, condition = "Pool")
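If you prefer the counts behind these plots as numbers, a plain tabulation of the sample metadata is enough. The sketch below uses a made-up metadata table; the sample names and the values in `Group` and `Pool` are hypothetical and only mirror the column names used in this vignette. For the real object, the metadata could presumably be extracted with `as.data.frame(SummarizedExperiment::colData(se))`.

```r
# Toy sample metadata; all values are invented for illustration.
sample_meta <- data.frame(
  Column = paste0("S", 1:6),
  Group  = c("HC", "TB", "ref", "HC", "TB", "ref"),
  Pool   = c("Pool1", "Pool1", "Pool1", "Pool2", "Pool2", "Pool2")
)

# Number of samples per batch (Pool)
pool_counts <- table(sample_meta$Pool)
pool_counts

# Cross-tabulation of sample group against batch
group_by_pool <- table(sample_meta$Group, sample_meta$Pool)
group_by_pool
```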
A general overview of the protein intensities across the different samples is provided by the function plot_heatmap(). The parameter “ain” specifies the data to plot; currently, only “raw” and “log2” are available (see names(assays(se))). Later, once multiple normalization methods have been executed, their results are saved as additional assays, and the normalized data can be visualized in the same way.
available_ains <- names(assays(se))
plot_heatmap(se, ain = "log2", color_by = c("Pool", "Group"),
             label_by = NULL, only_refs = FALSE)
#> Label of SummarizedExperiment used!
#> $log2
Similarly, an upset plot can be generated to visualize the overlaps between sets defined by a specific column in the metadata. The sets are generated by using non-NA values.
plot_upset(se, color_by = NULL, label_by = NULL, mb.ratio = c(0.7, 0.3),
           only_refs = FALSE)
#> Condition of SummarizedExperiment used!
#> Label of SummarizedExperiment used!
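To make the idea of sets built from non-NA values concrete, the following base R sketch derives one set of detected proteins per sample from a made-up toy matrix and then intersects them. It assumes that one set corresponds to one sample, which is only one possible reading of how plot_upset() groups the data; the names `log2_mat`, `detected`, and `shared_all` are hypothetical.

```r
# Toy log2 intensity matrix (proteins x samples); values are invented for illustration.
log2_mat <- matrix(
  c(10.1,   NA,  9.8,
      NA,   NA, 11.2,
     8.7,  9.0,  9.3,
    12.4,   NA, 12.0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("P", 1:4), paste0("S", 1:3))
)

# One set per sample: the proteins with a non-NA value in that sample
detected <- lapply(colnames(log2_mat), function(s) {
  rownames(log2_mat)[!is.na(log2_mat[, s])]
})
names(detected) <- colnames(log2_mat)

# Set sizes, i.e. the bar lengths an upset plot would show per set
lengths(detected)

# Proteins detected in every sample (the full intersection)
shared_all <- Reduce(intersect, detected)
shared_all
```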
If you are interested in the intensities of specific biomarkers, you can use the plot_markers_boxplots() function to compare the distribution of intensities per group. The plot can be generated per marker and faceted by normalization method (facet_norm = TRUE), or per normalization method and faceted by marker (facet_marker = TRUE).
p <- plot_markers_boxplots(se,
                           markers = c("Q92954;J3KP74;E9PLR3", "Q9Y6Z7"),
                           ain = "log2",
                           id_column = "Protein.IDs",
                           facet_norm = FALSE,
                           facet_marker = TRUE)
#> Condition of SummarizedExperiment used!
#> No shaping done.
p[[1]] + ggplot2::theme(axis.text.x = ggplot2::element_text(angle = 90, vjust = 0.5))