Algorithms for sorting large datasets can be made more efficient with careful use of memory hierarchies and reduction in the
number of costly memory accesses. In earlier work, we introduced burstsort, a new string sorting algorithm that on large sets
of strings is almost twice as fast as previous algorithms, primarily because it is more cache-efficient. The approach in burstsort
is to dynamically build a small trie that is used to rapidly allocate each string to a bucket. In this paper, we introduce
new variants of our algorithm: SR-burstsort, DR-burstsort, and DRL-burstsort. These algorithms use a random sample of the
strings to construct an approximation to the trie prior to sorting. Our experimental results with sets of over 30 million
strings show that the new variants reduce cache misses further than did the original burstsort, by up to 37%, while simultaneously
reducing instruction counts by up to 24%. In pathological cases, even further savings can be obtained.