This vignette describes how to perform quality control for mass-spectrometry based hydrogen-deuterium exchange experiments.
hdxmsqc 1.0.1
The hdxmsqc package is a quality control assessment package
for hydrogen-deuterium exchange mass-spectrometry (HDX-MS) data. The functions
look for outliers in retention time and ion mobility. They also examine missing
values, mass errors, intensity-based outliers, deviations of the data from
monotonicity, the correlation of charge states, whether uptake values
are coherent across overlapping peptides, and finally the similarity of the
observed spectra to the theoretical spectra. The package is designed
to help those performing iterative quality control through manual inspection,
and it also provides a set of metrics and visualizations that practitioners can use
to demonstrate that their data are of high quality.
The packages required are the following.
suppressMessages(require(hdxmsqc))
require(S4Vectors)
suppressMessages(require(dplyr))
require(tidyr)
## Loading required package: tidyr
##
## Attaching package: 'tidyr'
## The following object is masked from 'package:S4Vectors':
##
## expand
require(QFeatures)
require(RColorBrewer)
## Loading required package: RColorBrewer
require(ggplot2)
## Loading required package: ggplot2
require(MASS)
## Loading required package: MASS
##
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
##
## select
require(pheatmap)
## Loading required package: pheatmap
require(Spectra)
require(patchwork)
## Loading required package: patchwork
##
## Attaching package: 'patchwork'
## The following object is masked from 'package:MASS':
##
## area
We first load the data, as exported from HDExaminer.
BRD4uncurated <- data.frame(read.csv(system.file("extdata", "ELN55049_AllResultsTables_Uncurated.csv", package = "hdxmsqc", mustWork = TRUE)))
The following code chunk tidies the dataset, improving the formatting and converting it to wide format. It also reports the number of states, timepoints and peptides.
BRD4uncurated_wide <- processHDE(HDExaminerFile = BRD4uncurated,
proteinStates = c("wt", "iBET"))
## Number of peptide sequence: 167
## Number of timepoints: 7
## Number of Protein States: 2
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
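To make the tidying step and the coercion warnings above more concrete, here is a minimal base R sketch of the same idea on a toy table. It is not the implementation of processHDE: the data frame `long`, its column names and the "n/a" placeholder are all assumptions invented for illustration. It shows how a non-numeric entry becomes NA when a text column is coerced to numeric, how the number of peptides and states can be counted, and how a long table is spread into wide format.

```r
# Toy long-format table; every name and value here is hypothetical
long <- data.frame(
  Sequence = c("AAA", "AAA", "BBB", "BBB"),
  State    = c("wt", "iBET", "wt", "iBET"),
  Deut     = c("1.2", "0.8", "n/a", "2.5"),
  stringsAsFactors = FALSE
)

# Coercing text to numeric turns unparseable entries such as "n/a" into NA;
# without suppressWarnings() this emits "NAs introduced by coercion"
long$Deut <- suppressWarnings(as.numeric(long$Deut))

# Count unique peptides and protein states
n_peptides <- length(unique(long$Sequence))
n_states   <- length(unique(long$State))

# Spread to wide format: one row per peptide, one column per state
wide <- reshape(long, idvar = "Sequence", timevar = "State",
                direction = "wide")
```

Checking which cells became NA after coercion is a quick way to confirm that the warnings come from expected placeholders rather than from malformed numbers.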
The next code chunk extracts the columns with the quantitative data.
i <- grep(pattern = "X..Deut",
x = names(BRD4uncurated_wide))
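The pattern used above deserves a short note. Names such as "X..Deut" typically arise because read.csv converts headers into syntactically valid R names. The sketch below assumes a hypothetical header "% Deut" (the real export header is not shown in this vignette) and a made-up column "XabDeut_flag". It also illustrates that an unescaped "." in a regular expression matches any character, so a literal match with `fixed = TRUE` can be the safer choice.

```r
# Hypothetical miniature export; the headers are assumptions for illustration
raw <- read.csv(text = "Sequence,% Deut,XabDeut_flag\nAAA,12.5,1\nBBB,30.1,0")

# read.csv makes the headers syntactically valid R names
cols <- names(raw)

# Regular-expression match: each "." matches any single character
loose <- grep(pattern = "X..Deut", x = cols)

# Literal match: the dots are treated as plain characters
strict <- grep(pattern = "X..Deut", x = cols, fixed = TRUE)

# Guard against silently selecting no quantitative columns
stopifnot(length(strict) > 0)
```

In this toy example the loose pattern also picks up "XabDeut_flag", whereas the literal match selects only the deuteration column.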
We now parse the data into an object of class QFeatures. This standardises
the formatting of the data.
BRD4df <- readQFeatures(assayData = BRD4uncurated_wide,
ecol = i,
names = "Deuteration",
fnames = "fnames")
## Checking arguments.
## Warning in .checkWarnEcol(quantCols, ecol): 'ecol' is deprecated, use
## 'quantCols' instead.
## Loading data as a 'SummarizedExperiment' object.
## Formatting sample annotations (colData).
## Formatting data as a 'QFeatures' object.
A simple heatmap gives a first overview of the data.
pheatmap(assay(BRD4df), cluster_rows = FALSE, scale = "row")
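The argument `scale = "row"` asks pheatmap to standardise each row before colouring, so that every peptide is shown relative to its own mean and spread. The following base R sketch reproduces that row-wise standardisation on a small made-up matrix `m`; the values are hypothetical and the code only illustrates the arithmetic.

```r
# Hypothetical matrix: two peptides (rows) by three timepoints (columns)
m <- matrix(c(1, 2, 3,
              10, 20, 30),
            nrow = 2, byrow = TRUE)

# Row-wise z-score: subtract each row's mean, divide by each row's
# standard deviation (both vectors recycle down the rows)
row_sd <- apply(m, 1, sd)
z <- (m - rowMeans(m)) / row_sd

# Rows with zero spread cannot be standardised (division by zero)
constant_rows <- which(row_sd == 0)
```

After scaling, both rows have the same profile even though their raw magnitudes differ tenfold, which is why the scaled heatmap emphasises the shape of each uptake curve instead of its absolute level.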