endothelin receptor antagonist / Acknowledgements / Value

    2018-10-23


    Acknowledgements
    Value of the data
    We provide a standard proteomic dataset based on a highly complex sample (yeast lysate) spiked with different levels of a second, calibrated protein mixture of medium complexity (UPS1 standard, 48 proteins). The dataset can be used to statistically evaluate label-free approaches for the detection of differentially abundant proteins.
    Data and experimental design
    We provide a dataset composed of raw MS files corresponding to the analysis of a series of yeast cell lysate samples spiked with different amounts of an equimolar mixture of 48 recombinant proteins (Sigma UPS1). The 9 different samples were analyzed in triplicate on an LTQ-Velos Orbitrap, and the resulting 27 raw files can be downloaded from ProteomeXchange using the identifier PXD001819 (http://www.ebi.ac.uk/pride/archive/projects/PXD001819). The spiked UPS1 proteins can be easily identified after database search and constitute the “ground truth” panel of differentially abundant proteins in quantitative pairwise comparisons of samples from the dataset. Conversely, the background of yeast proteins should remain invariant in quantitative comparisons of these samples. As UPS1 proteins constitute a very minor proportion of the global proteome in each sample (see Fig. 1, which shows the histogram of iBAQ values for yeast background and UPS1 proteins, respectively, in the different samples), the samples can in principle be used to simulate a biological situation where only a minor part of the protein population undergoes abundance changes, and data can be normalized based on the median of intensity values for the global protein population. Nevertheless, the spiked mixture is relatively complex (48 proteins) and can be used to statistically approximate the sensitivity of the analytical and bioinformatics workflows, by calculating the proportion of UPS1 proteins correctly detected as showing differential signals. As a proof of principle of the potential utility of this dataset, we also provide some processed data in Excel format (Supplementary Table 1), obtained from different bioinformatics workflows combining tools for database search, protein validation, and label-free quantification.
    These workflows are described in detail in [1] and in the Materials and Methods section below, and include MFPaQ [2,3], Irma/Heidi, Scaffold, MaxQuant [4–6] and Skyline [7,8] as tools to extract quantitative metrics. The quantitative value extracted for each protein is either a total spectral count obtained after the validation step, or a protein intensity value obtained from the MS signal intensity of the associated peptides. The Excel file provided here (Supplementary Table 1) is composed of 8 sheets containing the quantitative data from the 8 workflows tested (4 spectral count-based and 4 MS intensity-based workflows). The experimental design of the data processing is illustrated in Fig. 2. Among the 9 spiked samples analyzed, we selected 5 in order to perform different pairwise quantitative comparisons of samples, trying to mimic distinct biochemical situations (comparisons A, B and C, corresponding respectively to detection in only one condition, a high fold change, and a moderate fold change). The quantitative outputs obtained from these pairwise comparisons were then combined, in order to reconstruct a simulated dataset containing true-positive hits with different intensities and fold change values, and to illustrate the performance of the bioinformatic and statistical methods in a more comprehensive way. Each Excel sheet in Supplementary Table 1 corresponds to this mixed dataset, for a particular bioinformatic workflow.
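The median-based normalization and the sensitivity estimate described above can be illustrated with a minimal sketch. It is not the procedure used by any of the workflows listed; the function names, the input layout (a mapping from sample name to per-protein intensities) and the false-discovery helper are assumptions made for illustration only.

```python
import math
from statistics import median


def median_normalize(samples):
    """Return log2 intensities shifted so every sample has the same median.

    `samples` maps a sample name to {protein_id: raw_intensity}.
    This layout is hypothetical; zero or negative intensities are skipped.
    """
    logs = {
        name: {p: math.log2(v) for p, v in prots.items() if v > 0}
        for name, prots in samples.items()
    }
    medians = {name: median(vals.values()) for name, vals in logs.items()}
    # Align all samples on the median of the per-sample medians.
    target = median(medians.values())
    return {
        name: {p: x - medians[name] + target for p, x in vals.items()}
        for name, vals in logs.items()
    }


def sensitivity(called, ground_truth):
    """Proportion of spiked (ground-truth) proteins reported as differential."""
    truth = set(ground_truth)
    return len(truth & set(called)) / len(truth)


def false_discovery_proportion(called, ground_truth):
    """Proportion of reported proteins that are not in the ground truth."""
    called = set(called)
    if not called:
        return 0.0
    return len(called - set(ground_truth)) / len(called)
```

The normalization relies on the point made in the text: because the spiked proteins are a small minority, the median is dominated by the invariant yeast background, so aligning medians corrects loading differences without erasing the spiked-in changes. With a list of proteins flagged by a workflow and the list of UPS1 identifiers, `sensitivity` gives the fraction of the 48 spiked proteins that were recovered.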
    Materials and methods
    Sample preparation. A yeast cell lysate was prepared in 8 M urea/0.1 M ammonium bicarbonate buffer, the protein concentration was adjusted to 8 µg/µL after a Bradford assay, and this lysate was used to resuspend the UPS1 standard mixture (Sigma) and perform its serial dilution. Twenty µL of each of the resulting samples, corresponding to 9 different spiked levels of UPS1 (0.05, 0.125, 0.250, 0.5, 2.5, 5, 12.5, 25 and 50 fmol of UPS1/µg of yeast lysate, respectively), were reduced with DTT and alkylated with iodoacetamide. The urea concentration was lowered to 1 M by dilution, and proteins were digested in solution by addition of 2% trypsin overnight. Enzymatic digestion was stopped by addition of TFA (0.5% final concentration).
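The quantities implied by the sample preparation can be worked out with a short calculation. The sketch below only uses the figures stated in the text (8 µg/µL lysate, 20 µL per sample, the 9 spike levels, urea lowered from 8 M to 1 M); the function and constant names are hypothetical, and it assumes the dilution to 1 M urea is a simple volume dilution of the 20 µL aliquot.

```python
YEAST_CONC_UG_PER_UL = 8.0   # lysate protein concentration after Bradford assay
SAMPLE_VOLUME_UL = 20.0      # volume of each spiked sample taken for digestion
UREA_START_M = 8.0           # urea molarity of the lysis buffer
UREA_FINAL_M = 1.0           # urea molarity before trypsin digestion

# Spike levels in fmol of UPS1 per µg of yeast lysate, as listed in the text.
SPIKE_LEVELS = [0.05, 0.125, 0.250, 0.5, 2.5, 5, 12.5, 25, 50]


def yeast_mass_ug(conc_ug_per_ul=YEAST_CONC_UG_PER_UL, volume_ul=SAMPLE_VOLUME_UL):
    """Mass of yeast protein in one sample aliquot."""
    return conc_ug_per_ul * volume_ul


def ups1_fmol_per_sample(level_fmol_per_ug):
    """Amount of UPS1 (fmol) in one aliquot at a given spike level."""
    return level_fmol_per_ug * yeast_mass_ug()


def diluted_volume_ul(volume_ul=SAMPLE_VOLUME_UL,
                      start_m=UREA_START_M, final_m=UREA_FINAL_M):
    """Final volume after diluting urea, assuming a simple volume dilution."""
    return volume_ul * (start_m / final_m)


def expected_fold_change(level_a, level_b):
    """Expected UPS1 ratio between two spike levels (yeast background stays at 1)."""
    return level_a / level_b


if __name__ == "__main__":
    print(f"yeast protein per sample: {yeast_mass_ug():g} µg")
    print(f"volume after urea dilution: {diluted_volume_ul():g} µL")
    for level in SPIKE_LEVELS:
        print(f"{level:g} fmol/µg -> {ups1_fmol_per_sample(level):g} fmol UPS1 per sample")
```

Each aliquot therefore contains 160 µg of yeast protein, and the expected UPS1 ratio between any two samples is simply the ratio of their spike levels, which is what makes the spiked proteins usable as a ground truth in pairwise comparisons.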