Guidelines


Depositing a Dataset

Depositing a dataset is a two-step process: you register your data online, then upload it for public access.

What you need to get started:

  1. PDB code
  2. Original diffraction images
  3. Small image (200x200 pixels) representing your structure

Step 1. Register your data online

  • Instructions and examples are included on the registration form and will display when you hover your mouse over one of the blue field headings.
  • If a corresponding PDB file is available, you can copy most of this information from the PDB records. The PDB Methods section includes the beamline name and data collection date. The publication DOI is typically listed in the summary section, or you can look for it in PubMed Central.
  • The Data Collection Date can often be found within your image files. Run the command ‘head -n 30 imagefile.img’ to list the image header (e.g. in Q315 images the date appears on line 10: DATE=Wed Sep 8 03:11:40 2010;).
  • You should upload a small 200x200 JPEG image depicting your structure. A JPEG can be generated using our website: replace PDBCODE in the following URL with your PDB code: https://data.sbgrid.org/visualize/PDBCODE/ . Then take a screenshot of the image and adjust its size with the OS X Preview application (Tools/Adjust Size) before uploading it through the deposition form.
  • Upon submission:
    • Your dataset is assigned a unique ID.
    • The Principal Investigator (head of the SBGrid member laboratory) will receive an email detailing the content of the submission.
    • The depositor will receive an email with a data upload script.
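
The header lookup and URL substitution described in the bullets above can be sketched as two small shell helpers. This is only an illustration, not part of the SBGrid tooling: the function names image_date and visualize_url are hypothetical, and the date pattern assumes a plain-text header line of the form DATE=...; as in the Q315 example. Other detector formats may store the date differently.

```shell
# Hypothetical helper: print the DATE field found in the first 30 lines
# of an image file. Assumes a text header line such as
# "DATE=Wed Sep 8 03:11:40 2010;" (the Q315 example); other formats may differ.
image_date() {
    # LC_ALL=C lets sed treat any binary bytes after the header as plain bytes.
    head -n 30 "$1" | LC_ALL=C sed -n 's/^DATE=\([^;]*\);.*$/\1/p' | head -n 1
}

# Hypothetical helper: build the visualization URL by substituting
# a PDB code for the PDBCODE placeholder.
visualize_url() {
    printf 'https://data.sbgrid.org/visualize/%s/\n' "$1"
}
```

For example, image_date imagefile.img prints the collection date if the header contains one, and prints nothing otherwise, in which case the header should be inspected by hand with the head command.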

Step 2. Upload your data via rsync

  • Follow the directions in the upload script emailed upon data registration.
  • Copy all data frames (frequently with extension .img, .cbf, or a numerical extension, and sequentially numbered, e.g. native_001.img - native_180.img) to a single directory on your Linux or OS X workstation. Do not include any additional files (e.g. mtz, sca, log, pdb, or pdf files).
  • Do not zip, tar, or compress your files. Having them in a single directory will suffice.
  • Download the upload script to the workstation where data files are stored.
  • Run the script, e.g. ‘bash ./upload2sbdb_dataset-No6.bash’. The script will ask you for the full path (beginning with a '/') to the directory containing the dataset. All files in this directory will be considered part of the deposited dataset.
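
Because every file in the directory becomes part of the deposit, it can help to check the directory before running the upload script. The following is a minimal sketch of such a check, not part of the emailed script: the function name check_dataset_dir is hypothetical, and the list of archive extensions (zip, tar, gz, tgz, bz2) is an assumption added to the extensions named above.

```shell
# Hypothetical pre-upload check; not part of the emailed upload script.
# Returns 0 if the directory looks ready, 1 if unwanted files are present,
# 2 if the path is not an absolute path to a directory.
check_dataset_dir() {
    dir=$1
    case "$dir" in
        /*) ;;  # full path, as the upload script expects
        *) echo "error: path must begin with '/': $dir" >&2; return 2 ;;
    esac
    [ -d "$dir" ] || { echo "error: not a directory: $dir" >&2; return 2; }

    # Extensions the guidelines ask you to leave out, plus common
    # archive formats (the archive list is an assumption).
    bad=$(find "$dir" -type f \( \
        -name '*.mtz' -o -name '*.sca' -o -name '*.log' \
        -o -name '*.pdb' -o -name '*.pdf' \
        -o -name '*.zip' -o -name '*.tar' -o -name '*.gz' \
        -o -name '*.tgz' -o -name '*.bz2' \))
    if [ -n "$bad" ]; then
        echo "remove or decompress these files before uploading:" >&2
        echo "$bad" >&2
        return 1
    fi
    echo "ok: $(find "$dir" -type f | wc -l | tr -d ' ') files ready in $dir"
}
```

Running check_dataset_dir with the same full path you plan to give the upload script reports any files that should be removed first.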

Upon completion of Step 2 you will receive an email with a Digital Object Identifier (DOI) for your data. If necessary, SBGrid staff may decompress files, remove extraneous files, and re-compute checksums. The deposited data will be listed on the data.sbgrid.org webpage and released to the community on Tuesdays and Fridays at 5pm.

Congratulations! You have just published your data! You can find your dataset by replacing DATAID in the following address with its unique ID:

data.sbgrid.org/dataset/DATAID