CREDO: A Structural Interactomics Database For Drug Discovery

CREDO is a relational database that stores all pairwise atomic interactions, both inter- and intramolecular, between small molecules and macromolecules found in experimentally determined structures from the Protein Data Bank (PDB).

Structural Interactomics

CREDO contains the interactions between all molecules inside macromolecular complexes from the Protein Data Bank (PDB). These molecules include proteins, nucleic acids, carbohydrates and small molecules. Interactions between atoms are stored as Structural Interaction Fingerprints (SIFts), which were first described by Deng et al. CREDO currently implements 13 different interaction types, including hydrogen bonds, halogen bonds and carbonyl interactions.
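The idea of storing a contact as a fingerprint can be sketched as a fixed-length bit string with one bit per interaction type. This is only an illustration: the bit ordering, the names in `BIT_INDEX` and both helper functions are assumptions, and only the three interaction types named above are given a position.

```python
# Hypothetical bit layout. CREDO implements 13 interaction types, but only the
# three named in the text are assigned positions here; the order is an assumption.
N_TYPES = 13
BIT_INDEX = {"hydrogen_bond": 0, "halogen_bond": 1, "carbonyl": 2}


def encode_contact(interaction_types):
    """Return a 13-character bit string with one bit set per interaction type."""
    bits = ["0"] * N_TYPES
    for name in interaction_types:
        bits[BIT_INDEX[name]] = "1"
    return "".join(bits)


def combine(fingerprints):
    """OR per-contact fingerprints together, e.g. to summarise a binding site."""
    merged = 0
    for fp in fingerprints:
        merged |= int(fp, 2)
    return format(merged, f"0{N_TYPES}b")
```

Combining fingerprints with a bitwise OR gives a quick summary of which interaction types occur anywhere in a set of contacts, at the cost of losing the per-contact detail.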

Where possible, all polypeptide residues in CREDO are mapped onto protein sequences from UniProt through a complete sequence-to-structure mapping, using data from the PDBe Structure Integration with Function, Taxonomy and Sequence (SIFTS) initiative. This mapping allows information to be transferred from the sequence onto the structure (or vice versa), including cross-references to other databases. For example, information from UniProt is used to identify modified, non-standard or mutated peptides in PDB structures.

Structural Variations

Structural variations from EnsEMBL Variation are mapped onto all protein structures in CREDO through the sequence-to-structure mapping. EnsEMBL Variation contains variation data from major sources, including dbSNP, COSMIC and UniProt, as well as information about (disease) phenotypes that can be linked to variations occurring in protein structures. This means that phenotypes can be linked directly to ligand binding sites or protein-protein interfaces.
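A minimal sketch of how a variation position on a UniProt sequence could be transferred onto a structure, assuming the mapping is available as gap-free segments. The `Segment` type, its fields and `uniprot_to_pdb` are hypothetical names used for illustration, not CREDO's schema.

```python
from typing import List, NamedTuple, Optional, Tuple


class Segment(NamedTuple):
    """A gap-free stretch where sequence and structure numbering run in parallel."""
    chain: str          # PDB chain identifier
    pdb_start: int      # first residue number of the segment in the structure
    uniprot_start: int  # first position of the segment in the UniProt sequence
    length: int         # number of residues covered by the segment


def uniprot_to_pdb(position: int, segments: List[Segment]) -> Optional[Tuple[str, int]]:
    """Map a UniProt sequence position to (chain, residue number), or None."""
    for seg in segments:
        offset = position - seg.uniprot_start
        if 0 <= offset < seg.length:
            return seg.chain, seg.pdb_start + offset
    return None  # position not observed in the structure
```

A variation that maps to `None` falls in a region of the sequence not covered by the structure, so no structural context can be attached to it.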

A range of cheminformatics routines is supported for retrieving chemical components. Structural queries are supported in the form of sub-/superstructure searches as well as SMARTS patterns. Chemical components can also be searched by topological similarity using circular, atom-pair and torsion fingerprints. For example, a substructure search can fetch all compounds that contain 4-(3-pyridyl)pyrimidine. 3D similarity searching is also supported through Ultrafast Shape Recognition (USR).
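To illustrate the idea behind USR, the sketch below follows the generally published scheme: distances from every atom to four reference points are summarised by three moments each, giving 12 numbers per molecule, and two molecules are compared through the mean absolute difference of these numbers. It is not CREDO's implementation; the function names are illustrative, and the moments are left unnormalised (mean, variance, third central moment), which is an assumption since implementations differ on this point.

```python
import math


def _moments(values):
    """Mean, variance and third central moment of a list of distances."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    return [mean, var, third]


def usr_descriptor(coords):
    """Return the 12 shape descriptors for a list of (x, y, z) atom coordinates."""
    n = len(coords)
    ctd = tuple(sum(c[i] for c in coords) / n for i in range(3))  # centroid
    d_ctd = [math.dist(ctd, c) for c in coords]
    cst = coords[d_ctd.index(min(d_ctd))]  # atom closest to the centroid
    fct = coords[d_ctd.index(max(d_ctd))]  # atom farthest from the centroid
    d_fct = [math.dist(fct, c) for c in coords]
    ftf = coords[d_fct.index(max(d_fct))]  # atom farthest from fct
    descriptor = []
    for ref in (ctd, cst, fct, ftf):
        descriptor += _moments([math.dist(ref, c) for c in coords])
    return descriptor


def usr_similarity(a, b):
    """Score in (0, 1]; 1.0 means the two descriptors are identical."""
    return 1.0 / (1.0 + sum(abs(x - y) for x, y in zip(a, b)) / len(a))
```

Because the descriptors are built only from interatomic distances, they do not change when a molecule is translated or rotated, so no structural alignment is needed before comparing two shapes.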

Data Validation

One of the design decisions made for CREDO was to keep as much data from the PDB as possible. Data in CREDO is therefore annotated with additional information that can be used to assess the quality of a macromolecular complex. This includes structure factors, boolean flags that indicate missing, disordered or clashing atoms, as well as information about regions of missing residues.

$ curl -Haccept:application/json -L ''

Almost all website resources accept JSON requests, which is convenient for advanced users who want to query CREDO from the command line, use it from within a script or embed data in other websites.
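From within a script, the same kind of request could be made with Python's standard library, as in the sketch below. The helper names are illustrative, and no CREDO resource URL is assumed: the caller supplies the address of the resource to query.

```python
import json
import urllib.request


def build_json_request(url):
    """Build a GET request that asks the server for a JSON representation."""
    return urllib.request.Request(url, headers={"Accept": "application/json"})


def fetch_json(url, timeout=30):
    """Send the request and decode the JSON body.

    urllib follows HTTP redirects by default, which corresponds to curl's -L.
    """
    with urllib.request.urlopen(build_json_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from the network call makes it easy to inspect or adjust the headers before anything is sent.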

