Data exchange is the problem of taking data structured under a source schema and creating an instance of a target schema that
reflects the source data as accurately as possible. In this paper, we address foundational and algorithmic issues related
to the semantics of data exchange and to query answering in the context of data exchange. These issues arise because, given
a source instance, there may be many target instances that satisfy the constraints of the data exchange problem. We give an
algebraic specification that selects, among all solutions to the data exchange problem, a special class of solutions that
we call universal. A universal solution has no more and no less data than required for data exchange and it represents the entire space of
possible solutions. We then identify fairly general, and practical, conditions that guarantee the existence of a universal
solution and yield algorithms to compute a canonical universal solution efficiently.We adopt the notion of “certain answers”
in indefinite databases for the semantics for query answering in data exchange. We investigate the computational complexity
of computing the certain answers in this context and also study the problem of computing the certain answers of target queries
by simply evaluating them on a canonical universal solution.
Research carried out while these authors were visiting scientists at the IBM Almaden Research Center. Kolaitis was also partially
supported by NSF Grant IIS-9907419. Miller was also partially supported by a research grant from NSERC.