A data-intensive program is one in which much of the complexity and design effort is centred around data definition and manipulation.
Many organisations have a substantial investment in data design (data structures and constraints) coded in data-intensive programs.
While there is a rich collection of techniques that can extract data design from database schemas, the extraction of data
design from data-intensive programs is still a largely unsolved problem. In this paper, we propose a query-based approach
to this problem. Our approach allows users (maintainers or reverse engineers) to express a complex extraction task as a sequence
of queries over the source program. Unlike conventional techniques, which are designed for extracting a specific aspect of
a data design, our approach gives the user control over what to extract and how to extract it, in an exploratory
manner. Given the variety of coding styles used in data-intensive programs, we believe that the exploratory nature of our
approach represents a plausible way forward for extracting data design from data-intensive programs. We demonstrate the usefulness
of our approach with a number of examples.
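
To give a flavour of what "a sequence of queries over the source program" might look like, the following is a minimal sketch in Python. It is not the query language proposed in this paper: the toy program text, the `Stmt` record, and the `load`, `select` and `variables` helpers are all hypothetical names introduced only for illustration, and plain regular expressions stand in for a real program representation. Each query takes the result of an earlier one, so the user can inspect intermediate results and decide what to ask next.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Stmt:
    """One statement of the (hypothetical) source program."""
    line: int
    text: str


def load(source):
    """Split program text into numbered, non-empty statements."""
    lines = source.strip().splitlines()
    return [Stmt(i, t.strip()) for i, t in enumerate(lines, start=1) if t.strip()]


def select(stmts, pattern):
    """Query: keep the statements whose text matches a regular expression."""
    rx = re.compile(pattern)
    return [s for s in stmts if rx.search(s.text)]


def variables(stmts, pattern):
    """Query: collect the names captured by the pattern's 'var' group."""
    rx = re.compile(pattern)
    return sorted({m.group("var") for s in stmts for m in rx.finditer(s.text)})


# A made-up data-intensive fragment (assumption: not taken from the paper).
PROGRAM = """
READ customer INTO cust_rec
IF cust_age < 18 THEN REJECT
IF cust_age > 120 THEN REJECT
MOVE cust_age TO out_age
WRITE out_rec
"""

stmts = load(PROGRAM)

# Query 1: which statements reject input?
rejects = select(stmts, r"\bREJECT\b")

# Query 2: which variables do those statements test?
tested = variables(rejects, r"IF\s+(?P<var>\w+)\s*[<>]")

# Query 3: narrow the earlier result to checks on one variable of interest.
age_checks = select(rejects, r"\bcust_age\b")

for s in age_checks:
    print(s.line, s.text)
```

In this sketch the three queries together suggest a range constraint on `cust_age`. The point is only that the constraint is reached by composing small queries whose intermediate results the user can examine, rather than by a single fixed extraction procedure.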