Unlock Rapid Analyses Across the Whole PDB Using BinaryCIF

Webinar hosted by RCSB PDB and the Rutgers Institute for Quantitative Biomedicine as part of ISCBAcademy and the Rutgers Artificial Intelligence and Data Collaboratory | November 4, 2024

As macromolecular structures available through the Protein Data Bank (PDB) archive continue to grow in complexity and size, traditional text data formats like PDBx/mmCIF and the legacy PDB file format are becoming increasingly inefficient for transfer and parsing. To support scalable data analysis, binary formats and compression techniques are now essential. Learn how to future-proof your data analysis with BinaryCIF, a fully interchangeable yet drastically more efficient flavor of the PDBx/mmCIF format. BinaryCIF not only boosts storage efficiency, but also substantially improves parsing speed, making it ideal for large-scale analyses. BinaryCIF is supported by resources such as RCSB PDB, PDBe, and AlphaFold DB.
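To give a feel for where the space savings come from, here is a minimal Python sketch of two of the integer encodings BinaryCIF can chain together: delta encoding and run-length encoding. It is a simplified illustration, not a spec-complete decoder. The function names are hypothetical, and the byte-level packing and MessagePack container that a real BinaryCIF file uses are left out.

```python
def run_length_decode(pairs):
    """Expand a flat [value, count, value, count, ...] list into the full array."""
    out = []
    for value, count in zip(pairs[0::2], pairs[1::2]):
        out.extend([value] * count)
    return out


def delta_decode(deltas, origin=0):
    """Rebuild absolute values from consecutive differences, starting at origin."""
    out = []
    current = origin
    for d in deltas:
        current += d
        out.append(current)
    return out


# A column of sequential atom ids 1..6 is highly regular:
# as deltas (origin 1) it becomes [0, 1, 1, 1, 1, 1],
# and run-length encoding shrinks that to [0, 1, 1, 5].
encoded = [0, 1, 1, 5]
atom_ids = delta_decode(run_length_decode(encoded), origin=1)
print(atom_ids)
```

Columns such as atom serial numbers or residue indices are long and highly regular, so stacking encodings like these reduces them to a handful of integers before any general-purpose compression is applied.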

After watching the videos featured in this course, you will be able to:

  • Understand the basics of the PDBx/mmCIF schema
  • Access BinaryCIF files and related APIs on RCSB.org
  • Programmatically consume BinaryCIF data and convert between formats
  • Compute archive-wide statistics across the entire PDB
  • Gain hands-on experience with our Python parser
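As a rough starting point for the access step, the sketch below builds a download URL for a BinaryCIF file and fetches the raw bytes using only the Python standard library. The URL pattern (`https://models.rcsb.org/<id>.bcif`, with an optional `.gz` suffix) is an assumption based on how RCSB PDB has served these files and should be checked against the current RCSB.org documentation. The helper names `bcif_url` and `fetch_bcif` are hypothetical.

```python
import gzip
import urllib.request

# Assumed base URL for BinaryCIF downloads; verify against RCSB.org docs.
BCIF_BASE_URL = "https://models.rcsb.org"


def bcif_url(entry_id, gzipped=True):
    """Return the assumed BinaryCIF download URL for a four-character PDB ID."""
    entry_id = entry_id.strip().lower()
    if len(entry_id) != 4 or not entry_id.isalnum():
        raise ValueError(f"expected a four-character PDB ID, got {entry_id!r}")
    suffix = ".bcif.gz" if gzipped else ".bcif"
    return f"{BCIF_BASE_URL}/{entry_id}{suffix}"


def fetch_bcif(entry_id, gzipped=True, timeout=30):
    """Download one entry and return the uncompressed BinaryCIF bytes."""
    with urllib.request.urlopen(bcif_url(entry_id, gzipped), timeout=timeout) as resp:
        payload = resp.read()
    return gzip.decompress(payload) if gzipped else payload


if __name__ == "__main__":
    data = fetch_bcif("4HHB")
    print(f"downloaded {len(data)} bytes of BinaryCIF data")
```

The returned bytes are a MessagePack-encoded document, so they need a BinaryCIF-aware parser (such as the Python parser covered in the course) before individual categories and columns can be read.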

Additional materials for this course are available:

Introduction

Yana Rose

Scientific Software Developer and Data Architect, RCSB PDB/UCSD

CIF, mmCIF, and BinaryCIF Basics. Create archive-wide statistics with Java

Sebastian Bittrich

Scientific Software Developer, RCSB PDB/UCSD

Working with mmCIF and BinaryCIF in Python

Dennis W. Piehl

Scientific Software Developer and FAIR Manager, RCSB PDB/Rutgers

Q&A

Yana Rose, Sebastian Bittrich, Dennis W. Piehl