Efficient Synchronization of Replicated Data in Distributed Systems
Thorsten Schütt, Florian Schintke and Alexander Reinefeld
Zuse Institute Berlin (ZIB), Germany
Abstract
We present nsync, a tool for synchronizing large replicated data sets in distributed systems. nsync computes nearly optimal
synchronization plans based on a hierarchy of gossip algorithms that take the network topology into account. Our primary
design goals were maximum performance and maximum scalability. We achieved these goals by exploiting parallelism in the
planning and synchronization phases, by omitting the transfer of unnecessary metadata, by synchronizing at the block level
rather than the file level, and by using sophisticated compression methods. With its relaxed consistency semantics, nsync
needs neither a master copy nor a quorum for updating distributed replicas. Each replica is kept as an autonomous entity
and can be modified with the usual tools.
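
To illustrate what synchronizing at the block level rather than the file level can mean, the following minimal sketch compares two replicas block by block and copies only the blocks that differ. It is not the nsync algorithm, which the abstract does not specify: the fixed block size, the use of SHA-256 digests, and the function names block_digests, blocks_to_transfer and synchronize are all assumptions made for this example.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical value; tiny so the example is easy to follow


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def blocks_to_transfer(source: bytes, target: bytes,
                       block_size: int = BLOCK_SIZE) -> list:
    """Indices of source blocks the target lacks or holds with other content."""
    src = block_digests(source, block_size)
    dst = block_digests(target, block_size)
    return [i for i, d in enumerate(src) if i >= len(dst) or d != dst[i]]


def synchronize(source: bytes, target: bytes,
                block_size: int = BLOCK_SIZE) -> bytes:
    """Bring target in line with source by copying only the differing blocks."""
    # Truncate or pad the target to the source length (assumed policy).
    patched = bytearray(target[:len(source)])
    patched.extend(b"\x00" * (len(source) - len(patched)))
    for i in blocks_to_transfer(source, target, block_size):
        start = i * block_size
        patched[start:start + block_size] = source[start:start + block_size]
    return bytes(patched)


if __name__ == "__main__":
    source = b"aaaabbbbccccdd"
    target = b"aaaaXbbbcccc"
    print(blocks_to_transfer(source, target))
    print(synchronize(source, target) == source)
```

In this sketch only the digests need to be exchanged before the transfer, so unchanged blocks never cross the network. A file-level scheme would resend the whole file after a single changed byte.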