Data DOI: 10.15785/SBGRID/1155 | ID: 1155
Walter Laboratory, Harvard Medical School
Release Date: 4 Feb 2025
1. If this dataset is locally available, it should be accessible at /programs/datagrid/1155
2. To download this dataset, please run the following command from your Terminal on a Linux or OS X workstation:
'rsync -av rsync://data.sbgrid.org/10.15785/SBGRID/1155 .'
(Harvard Medical School, USA)
Depending on your location, faster access may be available from a Tier 1 site closer to you:
'rsync -av rsync://sbgrid.icm.uu.se/10.15785/SBGRID/1155 .'
(Uppsala University, Sweden)
'rsync -av rsync://sbgrid.pasteur.edu.uy/10.15785/SBGRID/1155 .'
(Institut Pasteur de Montevideo, Uruguay)
'rsync -av rsync://sbgrid.ncpss.org/10.15785/SBGRID/1155 .'
(Shanghai Institutes for Biological Sciences, China)
3. After the transfer is completed, please issue the following command to verify data integrity:
'cd 1155 ; shasum -c files.sha'
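The download and verification steps above can be combined into a small script. The following is a minimal sketch, not an SBGrid-provided tool: the function names `mirror_url` and `fetch_dataset` and the variable names are hypothetical, and it assumes `rsync` and `shasum` are installed. It tries the four mirrors in the order listed above and runs the checksum verification once a transfer succeeds.

```shell
# Sketch only: names below are illustrative, not part of any SBGrid tooling.
DATASET_DOI="10.15785/SBGRID/1155"
DATASET_ID="1155"
# Mirror hosts in the order listed above (USA, Sweden, Uruguay, China).
MIRRORS="data.sbgrid.org sbgrid.icm.uu.se sbgrid.pasteur.edu.uy sbgrid.ncpss.org"

# Print the rsync URL of the dataset on the given mirror host.
mirror_url() {
  printf 'rsync://%s/%s\n' "$1" "$DATASET_DOI"
}

# Try each mirror until one transfer succeeds, then verify the file hashes.
fetch_dataset() {
  for host in $MIRRORS; do
    echo "Trying $host ..."
    if rsync -av "$(mirror_url "$host")" .; then
      # Verify in a subshell so the caller's working directory is unchanged.
      ( cd "$DATASET_ID" && shasum -c files.sha )
      return $?
    fi
  done
  echo "All mirrors failed" >&2
  return 1
}
```

To start the transfer, call `fetch_dataset` from the directory that should contain the `1155` folder. Reordering `MIRRORS` puts a closer site first.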
Storage requirements: 446G
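Because the transfer is large, it may be worth checking free disk space first. The sketch below assumes that "446G" means 446 GiB (the page does not say whether the unit is binary or decimal); `enough_space` and `free_kb` are hypothetical helper names.

```shell
# Assumption: "446G" is 446 GiB; adjust if the figure is in decimal gigabytes.
REQUIRED_KB=$((446 * 1024 * 1024))

# Succeed if the given number of free 1024-byte blocks covers the dataset.
enough_space() {
  [ "$1" -ge "$REQUIRED_KB" ]
}

# Print the free space, in 1024-byte blocks, of the filesystem holding a directory.
free_kb() {
  df -Pk "$1" | awk 'NR==2 {print $4}'
}
```

For example, `enough_space "$(free_kb .)" || echo "Not enough free space" >&2` warns before starting the download in the current directory.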
Biological Sample:
N/A
Dataset Type:
Structural Model
Subject Composition:
Protein
Collection Facility:
N/A
Data Creation Date:
1 Jul 2024
Related Datasets:
None
Schmid, EW; Walter, J. 2025. "N/A.", SBGrid Data Bank, V1, https://doi.org/10.15785/SBGRID/1155.
The set of all AlphaFold multimer (AF-M) v2.3 pairwise structure predictions accompanying the publication "Predictomes: A classifier-curated database of AlphaFold-modeled protein-protein interactions". This dataset includes the prediction pairs used for training random forest classifiers (including SPOC), the pairs used for 30 ranking experiments, all pairs that belong to the genome maintenance matrix on predictomes.org, and three proteome-wide in silico interaction screens conducted with human DONSON, human STK19, and human USP37. All pairs were generated with ColabFold v1.5.2. All predictions used the AF-M version 3 weights (models 1, 2, and 4) with 3 recycles, templates enabled, 1 ensemble, no dropout, and no AMBER relaxation. The multiple sequence alignments (MSAs), both unpaired and paired, supplied to AF-M were generated by the MMseqs2 server using default settings. The total sequence length per run was generally capped at 3,600 amino acids to avoid exhausting GPU memory.
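The settings described above can be illustrated with a short sketch. It is not the depositors' pipeline: `within_length_cap` and `print_colabfold_cmd` are hypothetical helpers, and the `colabfold_batch` flag names are given as I understand them for ColabFold v1.5.x, so they should be checked against `colabfold_batch --help` before use. The command is printed rather than run. AMBER relaxation and dropout are left off by omitting their flags, and the MSA settings are left at their defaults.

```shell
# Cap stated in the description; the source says it was applied "generally".
MAX_TOTAL_AA=3600

# Succeed if the summed length of all sequences given as arguments is within the cap.
within_length_cap() {
  total=0
  for seq in "$@"; do
    total=$((total + ${#seq}))
  done
  [ "$total" -le "$MAX_TOTAL_AA" ]
}

# Print (do not run) a colabfold_batch call mirroring the described settings.
# $1 = input file, $2 = output directory. Flag names are an assumption.
print_colabfold_cmd() {
  printf 'colabfold_batch --model-type alphafold2_multimer_v3 --model-order 1,2,4 --num-models 3 --num-recycle 3 --num-ensemble 1 --templates %s %s\n' "$1" "$2"
}
```

A pair would be screened with `within_length_cap "$seq_a" "$seq_b" && print_colabfold_cmd pair.fasta out_dir`, where `seq_a`, `seq_b`, `pair.fasta`, and `out_dir` are placeholders.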
Name | Additional Roles | Affiliation While Working on the Project |
---|---|---|
Ernst W Schmid | Data Collector, Depositor | Harvard Medical School |
Johannes Walter | PI | Harvard Medical School |
none
License: CC0
Terms: Our Community Norms as well as good scientific practices expect that proper credit is given via citation. Please use the data citation, as generated by the SBGrid Data Bank.