The computer industry is examining the use of strong syn- chronization operations such as double compare-and-swap (DCAS) as
a means of supporting non-blocking synchronization on tomorrow’s mul- tiprocessor machines. However, before such a primitive
will be incorpo- rated into hardware design, its utility needs to be proven by developing a body of effective non-blocking
data structures using DCAS.
In a previous paper, we presented two linearizable non-blocking imple- mentations of concurrent deques (double-ended queues)
using the DCAS operation. These improved on previous algorithms by nearly always al- lowing unimpeded concurrent access to
both ends of the deque while correctly handling the difficult boundary cases when the deque is empty or full. A remaining
open question was whether, using DCAS, one can design a non-blocking implementation of concurrent deques that allows dynamic
memory allocation but also uses only a single DCAS per push or pop in the best case.
This paper answers that question in the affirmative. We present a new non-blocking implementation of concurrent deques using
the DCAS operation. This algorithm provides the benefits of our previous techniques while overcoming drawbacks. Like our previous
approaches, this implementation relies on automatic storage reclamation to ensure that a storage node is not reclaimed and
reused until it can be proved that the node is not reachable from any thread of control. This algorithm uses a linked-list
representation with dynamic node allocation and therefore does not impose a fixed maximum capacity on the deque. It does not
equire the use of a “spare bit” in pointers. In the best case (no interference), it requires only one DCAS per push and one
DCAS per pop. We also sketch a proof of correctness.