With the growing number of citations and papers in the chemical literature, it’s no exaggeration to say that the role of water molecules in drug design is a hot topic. Conceptually, it’s easy to see why.

If we know where water likes to be in and around a protein binding site, then we can design inhibitors that displace the weakly bound waters and interact with the strongly bound ones. This, in turn, improves the binding affinity of the ligand (that is, it makes the binding free energy more favourable) by exploiting the hydrophobic effect.

Yet locating the waters around a binding site is not as trivial as it seems: subtle cooperative effects between water and the protein can result in poorly hydrated or dry regions of the binding site, whilst certain solvent-accessible binding sites are known to be completely devoid of water. With this in mind, how can computer-aided drug design (CADD) help in identifying a realistic water distribution around a protein binding site for use in drug design?

There are many tools available for predicting the location and affinity of water molecules in both unbound and ligand-bound protein binding sites. For example, Schrödinger’s WaterMap[1] predicts the location of waters through a Molecular Dynamics (MD) simulation, which is then post-processed to yield the enthalpy and entropy of each predicted site. The 3D-RISM technique[2], implemented in a variety of software packages, uses a rigorous first-principles approach to predict both the water and hydrophobic density around a protein. And the recently described Grand Canonical Integration[3] approach uses an ensemble of Monte Carlo simulations (so named because the algorithm, which relies on random numbers, made the inventors think of a casino) to capture cooperative effects between water, ligand and receptor, whilst yielding accurate binding free energies of the resultant water networks. All of these methods have been validated against known crystal structures, demonstrating their suitability for water location and affinity prediction. An example of the 3D-RISM method predicting the locations of waters in N9-neuraminidase, a protein found on the surface of influenza viruses and required for the release of newly formed virus particles from infected cells, is shown below in Figure 1.
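
To make the Monte Carlo idea concrete, here is a minimal sketch of a grand canonical Monte Carlo titration for a single hydration site. It is a toy two-state model, not the published Grand Canonical Integration method: the function names, the site energy and the values of the Adams parameter B (a dimensionless stand-in for the chemical potential) are all assumptions chosen for illustration.

```python
import math
import random


def simulate_occupancy(b_param, beta_energy, n_steps=50_000, seed=0):
    """Average occupancy of one hydration site from a toy GCMC run.

    b_param     -- Adams B parameter (hypothetical values, dimensionless)
    beta_energy -- water-site interaction energy divided by kT
                   (negative means favourable; hypothetical value)
    """
    rng = random.Random(seed)
    occupied = False
    total = 0
    for _ in range(n_steps):
        if occupied:
            # Deletion move with N = 1: log-acceptance is -B + beta*E.
            log_accept = -b_param + beta_energy
        else:
            # Insertion move with N = 0: log-acceptance is B - beta*E.
            log_accept = b_param - beta_energy
        # Metropolis criterion: accept with probability min(1, exp(log_accept)).
        if log_accept >= 0.0 or rng.random() < math.exp(log_accept):
            occupied = not occupied
        total += occupied
    return total / n_steps


def exact_occupancy(b_param, beta_energy):
    """Analytic occupancy of the same two-state model (a logistic curve in B)."""
    return 1.0 / (1.0 + math.exp(-(b_param - beta_energy)))


if __name__ == "__main__":
    site_energy = -4.0  # hypothetical site energy in units of kT
    for b in (-8.0, -6.0, -4.0, -2.0, 0.0):
        print(b, simulate_occupancy(b, site_energy), exact_occupancy(b, site_energy))
```

In this toy model the occupancy traces a logistic curve as B is raised, and the B value at half occupancy equals the site energy in units of kT. A real binding site involves many interacting waters and a full force field, which is where the cooperative effects mentioned above come in.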

Figure 1. The predicted water density (yellow) around the active site of N9-neuraminidase, in complex with the activated form of Oseltamivir. Note how the crystallographically resolved waters, shown as red and white spheres, are found in close proximity to the density, indicating that the method has been successful in finding the major hydration sites.
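
The visual comparison in the caption can also be put into numbers. The sketch below counts how many crystallographic waters lie within a cutoff distance of the nearest predicted hydration site. The function name, the coordinates and the 1.5 Å cutoff are assumptions made for illustration, not values taken from the study behind the figure.

```python
import math


def fraction_recovered(crystal_waters, predicted_sites, cutoff=1.5):
    """Fraction of crystallographic waters within `cutoff` angstroms
    of at least one predicted hydration site.

    Both inputs are sequences of (x, y, z) tuples in angstroms.
    The default cutoff is an illustrative assumption.
    """
    if not crystal_waters:
        raise ValueError("need at least one crystallographic water")
    hits = 0
    for water in crystal_waters:
        # Distance to the closest predicted site (infinite if there are none).
        nearest = min(
            (math.dist(water, site) for site in predicted_sites),
            default=math.inf,
        )
        if nearest <= cutoff:
            hits += 1
    return hits / len(crystal_waters)


if __name__ == "__main__":
    # Hypothetical coordinates, for demonstration only.
    crystal = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 10.0, 10.0)]
    predicted = [(0.5, 0.0, 0.0), (3.0, 1.0, 0.0)]
    print(fraction_recovered(crystal, predicted))
```

A fraction close to one suggests the prediction has found the major hydration sites, although the result depends on the cutoff chosen and on how the predicted density is reduced to discrete sites.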

Bayesian approaches have been described that estimate, from the binding affinity of a water molecule, whether it is likely to be displaceable.[4] This, however, is only half of the story. Predicting whether a ligand will actually displace a particular water molecule is an extremely challenging task, requiring knowledge of the binding affinity of both the water molecule in question[5] and the proposed ligand. The latter is notoriously difficult to estimate, and typically requires further techniques such as Free Energy Perturbation (FEP) or Thermodynamic Integration (TI). Despite the increase in computational power through GPU processing, such procedures are still generally too intensive to be of value in everyday CADD usage.
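
As a rough illustration of the Bayesian idea, the sketch below applies Bayes' theorem to a water binding free energy, assuming that displaceable and conserved waters each follow a Gaussian distribution of binding free energies. This is not the published classifier: the means, standard deviations and prior are hypothetical numbers picked only to show the mechanics.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Probability density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def prob_displaceable(dg_water, prior=0.5,
                      displaced=(-3.0, 2.0), conserved=(-7.0, 2.0)):
    """Posterior probability that a water is displaceable.

    dg_water  -- water binding free energy in kcal/mol
    prior     -- assumed prior probability of being displaceable
    displaced -- (mean, sd) of dG for displaceable waters (hypothetical)
    conserved -- (mean, sd) of dG for conserved waters (hypothetical)
    """
    like_disp = gaussian_pdf(dg_water, *displaced)
    like_cons = gaussian_pdf(dg_water, *conserved)
    # Bayes' theorem: posterior = prior * likelihood / evidence.
    evidence = prior * like_disp + (1.0 - prior) * like_cons
    return prior * like_disp / evidence


if __name__ == "__main__":
    for dg in (-8.0, -5.0, -2.0):
        print(dg, round(prob_displaceable(dg), 3))
```

With these made-up parameters, a weakly bound water receives a high probability of being displaceable and a tightly bound one a low probability. As noted above, this says nothing about whether a particular ligand will actually achieve the displacement.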

This certainly isn’t a limiting factor in using water distributions, however. Simply identifying weakly bound water is often good enough to warrant medicinal chemistry efforts in that region of interest, whilst performing solvent analysis around new leads can help to explain binding poses, kinetics[6] or even selectivity against potential off-targets.[7] Observing how water molecules bind to the unbound protein can help in understanding the hydrophobic and hydrophilic parts of the binding pocket, and can offer new avenues in ligand design which aim to interact with waters at the periphery of their network.
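
Flagging weakly bound water can be as simple as ranking predicted hydration sites by their free energy. The sketch below assumes each site comes with an enthalpy and an entropy term expressed relative to bulk water, and treats any site with a free energy above zero as weakly bound. The class, the site labels, the numbers and the threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HydrationSite:
    label: str
    dh: float         # enthalpy relative to bulk water, kcal/mol (hypothetical)
    minus_tds: float  # -T*dS relative to bulk water, kcal/mol (hypothetical)

    @property
    def dg(self):
        """Free energy relative to bulk water: dG = dH - T*dS."""
        return self.dh + self.minus_tds


def weakly_bound(sites, threshold=0.0):
    """Sites whose dG exceeds `threshold`, least stable first."""
    flagged = [site for site in sites if site.dg > threshold]
    return sorted(flagged, key=lambda site: site.dg, reverse=True)


if __name__ == "__main__":
    sites = [
        HydrationSite("W1", dh=-2.0, minus_tds=1.0),
        HydrationSite("W2", dh=1.5, minus_tds=2.0),
        HydrationSite("W3", dh=0.5, minus_tds=0.5),
    ]
    for site in weakly_bound(sites):
        print(site.label, site.dg)
```

Sites at the top of such a list mark regions where a ligand substituent might usefully replace water, whilst strongly bound sites are better treated as interaction partners.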

Water used to be seen as a passive element in the drug-design process; now, through the use of CADD and solvent analysis tools, it is anything but.


  1. Young T, Abel R, Kim B, Berne BJ, Friesner RA. Motifs for molecular recognition exploiting hydrophobic enclosure in protein-ligand binding. PNAS, 104, 808-813, 2007
  2. Sindhikara D, Hirata F. Analysis of Biomolecular Solvation Sites by 3D-RISM Theory. J. Phys. Chem. B, 117, 6718-6723, 2013
  3. Ross G, Bodnarchuk MS, Essex JW. Water sites, networks, and free energies with grand canonical Monte Carlo. J. Am. Chem. Soc., 137(47), 14930-14943, 2015
  4. Barillari C, Taylor J, Viner R, Essex JW. Classification of water molecules in protein binding sites. J. Am. Chem. Soc., 129, 2577-2587, 2007
  5. Bodnarchuk MS, Viner R, Michel J, Essex JW. Strategies to Calculate Water Binding Free Energies in Protein–Ligand Complexes. J. Chem. Inf. Model., 54, 1623-1633, 2014
  6. Bortolato A, Tehan BG, Bodnarchuk MS, Essex JW, Mason JS. Water Network Perturbation in Ligand Binding: Adenosine A2A Antagonists as a Case Study. J. Chem. Inf. Model., 53, 1700-1713, 2013
  7. Robinson DD, Sherman W, Farid R. Understanding kinase selectivity through energetic analysis of binding site waters. ChemMedChem, 5, 618-627, 2010


How to cite:

Bodnarchuk, Michael. Liquid Assets. Eureka blog. Jan 4, 2016. Available: http://eureka.criver.com/liquid-assets/