Clickthrough data has become increasingly popular as an implicit indicator of user feedback. Previous analysis
has suggested that user click behaviour is subject to a quality bias—that is, users click at different rank positions when
viewing effective search results than when viewing less effective search results. Based on this observation, it should be
possible to use click data to infer the quality of the underlying search system. In this paper we carry out a user study to
systematically investigate how click behaviour changes for different levels of search system effectiveness as measured by
information retrieval performance metrics. Our results show that click behaviour does not vary systematically with the quality
of search results. However, click behaviour does vary significantly between individual users, and between search topics. This
suggests that using direct click behaviour—click rank and click frequency—to infer the quality of the underlying search system
is problematic. Further analysis of our user click data indicates that clicks in a search result list correspond only weakly
to subsequent confirmation that the clicked resource is actually relevant. Clicks should therefore be treated with caution
as an implicit indication of relevance.