Protein–protein interaction
Protein–protein interactions (PPIs) are highly specific physical contacts established between two or more protein molecules as a result of biochemical events steered by interactions that include electrostatic forces, hydrogen bonding and the hydrophobic effect. Many are molecular associations between chains that occur in a cell or in a living organism in a specific biomolecular context.
Proteins rarely act alone, as their functions tend to be regulated. Many molecular processes within a cell are carried out by molecular machines that are built from numerous protein components organized by their PPIs. These physiological interactions make up the so-called interactome of the organism, while aberrant PPIs are the basis of multiple aggregation-related diseases, such as Creutzfeldt–Jakob and Alzheimer's diseases.
PPIs have been studied with many methods and from different perspectives, including biochemistry, quantum chemistry, molecular dynamics and signal transduction.[1][2][3] Together, this information enables the creation of large protein interaction networks[4] – similar to metabolic or genetic/epigenetic networks – that advance current knowledge of biochemical cascades and the molecular etiology of disease, and support the discovery of putative protein targets of therapeutic interest.
Properties of the interface
The study of the molecular structure can give fine details about the interface that enables the interaction between proteins. When characterizing PPI interfaces it is important to take into account the type of complex.[9]
Parameters evaluated include size (measured in absolute dimensions, Å², or as solvent-accessible surface area (SASA)), shape, complementarity between surfaces, residue interface propensities, hydrophobicity, segmentation and secondary structure, and conformational changes on complex formation.[9]
The great majority of PPI interfaces reflect the composition of protein surfaces rather than that of protein cores, even though they are frequently enriched in hydrophobic residues, particularly aromatic residues.[25] PPI interfaces are dynamic and frequently planar, although they can be globular and protruding as well.[26] Based on three structures – insulin dimer, trypsin-pancreatic trypsin inhibitor complex, and oxyhaemoglobin – Cyrus Chothia and Joel Janin found that between 1,130 and 1,720 Å² of surface area was removed from contact with water, indicating that hydrophobicity is a major factor in the stabilization of PPIs.[27] Later studies refined the buried surface area of the majority of interactions to 1,600±350 Å². However, much larger interaction interfaces were also observed and were associated with significant changes in the conformation of one of the interaction partners.[18] PPI interfaces exhibit both shape and electrostatic complementarity.[9][11]
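The buried surface area figures quoted above are commonly obtained as the difference between the solvent-accessible surface areas of the isolated partners and that of the complex. The following is a minimal sketch of that arithmetic, assuming the SASA values have already been computed by an external tool. The function names and the numeric inputs are hypothetical and chosen only for illustration, and the sketch assumes the buried area is reported as the total over both partners.

```python
def buried_surface_area(sasa_a: float, sasa_b: float, sasa_complex: float) -> float:
    """Return the total surface area (in Å²) buried when A and B form a complex.

    Inputs are the solvent-accessible surface areas of the two isolated
    partners and of the complex, assumed to come from an external calculation.
    """
    bsa = sasa_a + sasa_b - sasa_complex
    if bsa < 0:
        raise ValueError("complex SASA exceeds the sum of the isolated partners")
    return bsa


def within_typical_range(bsa: float, mean: float = 1600.0, spread: float = 350.0) -> bool:
    """Check whether a buried area lies inside the 1,600±350 Å² band quoted in the text."""
    return abs(bsa - mean) <= spread


# Hypothetical SASA values (Å²), chosen only to illustrate the arithmetic.
example_bsa = buried_surface_area(6200.0, 5900.0, 10500.0)
print(example_bsa, within_typical_range(example_bsa))
```

Some authors report the area buried per partner, roughly half of this total, so values should only be compared under the same convention.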
Databases
Large-scale identification of PPIs has generated hundreds of thousands of interactions, which have been collected in specialized biological databases that are continuously updated in order to provide complete interactomes. The first of these databases was the Database of Interacting Proteins (DIP).[62]
Primary databases collect information about published PPIs proven to exist via small-scale or large-scale experimental methods. Examples: DIP, Biomolecular Interaction Network Database (BIND), Biological General Repository for Interaction Datasets (BioGRID), Human Protein Reference Database (HPRD), IntAct Molecular Interaction Database, Molecular Interactions Database (MINT), MIPS Protein Interaction Resource on Yeast (MIPS-MPact), and MIPS Mammalian Protein–Protein Interaction Database (MIPS-MPPI).
Meta-databases normally result from the integration of primary databases information, but can also collect some original data.
Prediction databases include many PPIs that are predicted using several techniques. Examples: Human Protein–Protein Interaction Prediction Database (PIPs),[63] Interologous Interaction Database (I2D), Known and Predicted Protein–Protein Interactions (STRING-db), and Unified Human Interactome (UniHI).
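The integration step that meta-databases perform on primary databases can be pictured as taking the union of interaction lists while recording which sources report each pair. The sketch below is an illustration under simplifying assumptions, not the procedure of any particular meta-database: the database names and protein identifiers are hypothetical, interactions are treated as undirected binary pairs, and all sources are assumed to already share one identifier scheme.

```python
from collections import defaultdict


def merge_interactions(sources):
    """Merge per-database interaction lists into one non-redundant map.

    `sources` maps a database name to an iterable of (protein_a, protein_b)
    pairs. Pairs are treated as undirected, so (A, B) and (B, A) collapse
    into a single entry. Returns {frozenset_of_proteins: set_of_database_names}.
    """
    merged = defaultdict(set)
    for db_name, pairs in sources.items():
        for protein_a, protein_b in pairs:
            merged[frozenset((protein_a, protein_b))].add(db_name)
    return dict(merged)


# Hypothetical identifiers and database names, for illustration only.
sources = {
    "primary_db_1": [("P1", "P2"), ("P2", "P3")],
    "primary_db_2": [("P2", "P1"), ("P3", "P4")],
}
merged = merge_interactions(sources)
for pair, dbs in merged.items():
    print(sorted(pair), sorted(dbs))
```

In practice, mapping identifiers between databases is usually the harder part of integration; the sketch leaves it out.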
The aforementioned computational methods all depend on source databases whose data can be extrapolated to predict novel protein–protein interactions. Coverage differs greatly between databases. In general, primary databases have the fewest total protein interactions recorded, as they do not integrate data from multiple other databases, while prediction databases have the most because they include other forms of evidence in addition to experimental evidence. For example, the primary database IntAct has 572,063 interactions,[64] the meta-database APID has 678,000 interactions,[65] and the predictive database STRING has 25,914,693 interactions.[66] However, some of the interactions in the STRING database are only predicted by computational methods such as genomic context and are not experimentally verified.
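Because prediction databases mix experimental and purely computational evidence, a common first step when using them is to separate interactions that have experimental support from those that do not. The following is a minimal sketch of that filter. The record layout and the evidence labels ("experimental", "genomic_context") are assumptions made for illustration and are not the actual field names of STRING or any other database.

```python
def split_by_evidence(records, experimental_label="experimental"):
    """Split interaction records into experimentally supported and predicted-only.

    Each record is a dict with a "pair" tuple and an "evidence" set of labels.
    The label names are hypothetical; real databases use their own schemes.
    """
    verified, predicted_only = [], []
    for record in records:
        if experimental_label in record["evidence"]:
            verified.append(record)
        else:
            predicted_only.append(record)
    return verified, predicted_only


# Hypothetical records, for illustration only.
records = [
    {"pair": ("P1", "P2"), "evidence": {"experimental", "genomic_context"}},
    {"pair": ("P2", "P3"), "evidence": {"genomic_context"}},
    {"pair": ("P3", "P4"), "evidence": {"experimental"}},
]
verified, predicted_only = split_by_evidence(records)
print(len(verified), "with experimental support;", len(predicted_only), "predicted only")
```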