Scientific data is doubling every year. Virtual Observatories are being established at every scale of the physical world: from
elementary particles to materials, biological systems, environmental observatories, remote sensing, and the universe. These
collaborations collect increasing amounts of data, often at rates approaching petabytes per year. Many scientists will soon
obtain most of their data from large scientific repositories, often stored in the form of databases. The talk will
discuss the different requirements for such databases and examine user behavior through a few concrete examples taken from astronomy,
in particular from six years of usage of the Sloan Digital Sky Survey database. Interesting query patterns are emerging, where
users create custom “crawlers” to break large queries into many smaller, repetitive ones. The trial-and-error behavior of many exploratory
projects will also be discussed. Finally, the talk will present various scalable alternatives to large scientific analysis facilities.