Datasets Description

We used experimentally verified SADPr and pS data derived from six studies[1-6] and the PhosphoSitePlus database. Comparing the two datasets, we found 3,250 pSADPr peptides, 147,977 pS peptides, and 4,270 SADPr peptides.

To prepare high-confidence benchmark datasets for training and testing, we established the following procedure: (1) To avoid the over-estimation of performance caused by similar protein sequences, serine-modified peptides with >60% sequence identity were clustered together and the redundant sequences were removed. (2) The representative proteins in the dataset were randomly divided into two groups: 10/11 for cross-validation and the remaining 1/11 for independent testing.

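
The two steps above can be sketched in Python as follows. This is a minimal illustration, not the pipeline actually used: the text does not name the clustering tool or the identity measure, so the sketch assumes equal-length peptide windows, defines identity as the fraction of matching positions, and uses a simple greedy selection of representatives. The function names `identity`, `pick_representatives`, and `split_cv_and_test`, as well as the `seed` parameter, are hypothetical.

```python
import random


def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length peptides.

    Assumption: peptides are fixed-length windows, so no alignment is needed.
    """
    if not a or len(a) != len(b):
        raise ValueError("peptides must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def pick_representatives(peptides, threshold=0.6):
    """Greedy de-redundancy (an assumed stand-in for the clustering step).

    A peptide is kept as a representative only if its identity to every
    representative kept so far does not exceed `threshold`.
    """
    reps = []
    for pep in peptides:
        if all(identity(pep, rep) <= threshold for rep in reps):
            reps.append(pep)
    return reps


def split_cv_and_test(items, test_fraction=1 / 11, seed=0):
    """Randomly split items into a cross-validation set and a test set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Toy example: the second peptide shares 6 of 7 positions with the first,
# which is above the 60% threshold, so it is treated as redundant.
toy_peptides = ["AAASAAA", "AAASAAC", "CCCSCCC"]
toy_reps = pick_representatives(toy_peptides)
cv_set, test_set = split_cv_and_test(range(22))
```

With real data, a dedicated sequence-clustering tool would normally replace `pick_representatives`; the greedy loop here only shows the idea of keeping one representative per group of similar peptides before the 10/11 versus 1/11 split.
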
[1] Nowak, K. et al. Engineering Af1521 improves ADP-ribose binding and identification of ADP-ribosylated proteins. Nat Commun 11, 5199, doi:10.1038/s41467-020-18981-w (2020).
[2] Larsen, S. C., Hendriks, I. A., Lyon, D., Jensen, L. J. & Nielsen, M. L. Systems-wide Analysis of Serine ADP-Ribosylation Reveals Widespread Occurrence and Site-Specific Overlap with Phosphorylation. Cell Rep 24, 2493-2505 e2494, doi:10.1016/j.celrep.2018.07.083 (2018).
[3] Buch-Larsen, S. C. et al. Mapping Physiological ADP-Ribosylation Using Activated Ion Electron Transfer Dissociation. Cell Rep 32, 108176, doi:10.1016/j.celrep.2020.108176 (2020).
[4] Hendriks, I. A., Larsen, S. C. & Nielsen, M. L. An Advanced Strategy for Comprehensive Profiling of ADP-ribosylation Sites Using Mass Spectrometry-based Proteomics. Mol Cell Proteomics 18, 1010-1026, doi:10.1074/mcp.TIR119.001315 (2019).
[5] Luo, F., Wang, M., Liu, Y., Zhao, X. M. & Li, A. DeepPhos: prediction of protein phosphorylation sites with deep learning. Bioinformatics 35, 2766-2773, doi:10.1093/bioinformatics/bty1051 (2019).
[6] Wang, C. et al. GPS 5.0: An Update on the Prediction of Kinase-specific Phosphorylation Sites in Proteins. Genomics Proteomics Bioinformatics 18, 72-80, doi:10.1016/j.gpb.2020.01.001 (2020).