The recent literature is replete with papers evaluating computational tools (often those operating on 3D structures) for their
performance in a certain set of tasks. Most commonly these papers compare a number of docking tools for their performance
in cognate re-docking (pose prediction) and/or virtual screening. Related papers have been published on ligand-based tools:
pose prediction by conformer generators and virtual screening using a variety of ligand-based approaches. The reliability
of these comparisons is critically affected by a number of factors usually ignored by the authors, including bias in the datasets
used in virtual screening, the metrics used to assess performance in virtual screening and pose prediction, and errors in the crystal
structures used.
Keywords: Software evaluation · Pose prediction · Coordinate error · Virtual screening · Property bias
An erratum to this article can be found at
http://dx.doi.org/10.1007/s10822-008-9201-z