Henk Sips, Dick Epema and Hai-Xiang Lin
Front matter
1-2
Multicore Programming Challenges
3
Ibis: A Programming System for Real-World Distributed Computing
4
What Is in a Namespace?
7-8
Introduction
9-20
Atune-IL: An Instrumentation Language for Auto-tuning Parallel Applications
21-32
Assigning Blame: Mapping Performance to High Level Parallel Programming Abstractions
33-44
A Holistic Approach towards Automated Performance Analysis and Tuning
45-56
Pattern Matching and I/O Replay for POSIX I/O in Parallel Programs
57-68
An Extensible I/O Performance Analysis Framework for Distributed Environments
69-80
Grouping MPI Processes for Partial Checkpoint and Co-migration
81-92
Process Mapping for MPI Collective Communications
95-96
97-109
Stochastic Analysis of Hierarchical Publish/Subscribe Systems
110-121
Characterizing and Understanding the Bandwidth Behavior of Workloads on Multi-core Processors
122-134
Hybrid Techniques for Fast Multicore Simulation
135-148
PSINS: An Open Source Event Tracer and Execution Simulator for MPI Applications
149-161
A Methodology to Characterize Critical Section Bottlenecks in DSM Multiprocessors
165
166-177
Dynamic Load Balancing of Matrix-Vector Multiplications on Roadrunner Compute Nodes
178-190
A Unified Framework for Load Distribution and Fault-Tolerance of Application Servers
191-202
On the Feasibility of Dynamically Scheduling DAG Applications on Shared Heterogeneous Systems
203-215
Steady-State for Batches of Identical Task Trees
216-227
A Buffer Space Optimal Solution for Re-establishing the Packet Order in a MPSoC Network Processor
228-240
Using Multicast Transfers in the Replica Migration Problem: Formulation and Scheduling Heuristics
241-252
A New Genetic Algorithm for Scheduling for Large Communication Delays
253-264
Comparison of Access Policies for Replica Placement in Tree Networks
265-280
Scheduling Recurrent Precedence-Constrained Task Graphs on a Symmetric Shared-Memory Multiprocessor
281-292
Energy-Aware Scheduling of Flow Applications on Master-Worker Platforms
295-296
297-308
Last Bank: Dealing with Address Reuse in Non-Uniform Cache Architecture for CMPs
309-320
Paired ROBs: A Cost-Effective Reorder Buffer Sharing Strategy for SMT Processors
321-333
REPAS: Reliable Execution for Parallel ApplicationS in Tiled-CMPs
334-344
Impact of Quad-Core Cray XT4 System and Software Stack on Scientific Computation
347-348
349-360
Unifying Memory and Database Transactions
361-374
A DHT Key-Value Storage System with Carrier Grade Performance
375-386
Selective Replicated Declustering for Arbitrary Queries
389
390-403
POGGI: Puzzle-Based Online Games on Grid Infrastructures
404-416
Enabling High Data Throughput in Desktop Grids through Decentralized Data and Metadata Management: The BlobSeer Approach
417-428
MapReduce Programming Model for .NET-Based Cloud Computing
429-441
The Architecture of the XtreemOS Grid Checkpointing Service
442-453
Scalable Transactions for Web Applications in the Cloud
454-465
Provider-Independent Use of the Cloud
466-477
MPI Applications on Grids: A Topology Aware Approach
481-482
483-497
A Least-Resistance Path in Reasoning about Unstructured Overlay Networks
498-510
SiMPSON: Efficient Similarity Search in Metric Spaces over P2P Structured Overlay Networks
511-522
Uniform Sampling for Directed P2P Networks
523-534
Adaptive Peer Sampling with Newscast
535-547
Exploring the Feasibility of Reputation Models for Improving P2P Routing under Churn
548-560
Selfish Neighbor Selection in Peer-to-Peer Backup and Storage Applications
561-573
Zero-Day Reconciliation of BitTorrent Users with Their ISPs
574-586
Surfing Peer-to-Peer IPTV: Distributed Channel Switching
589
590-601
Distributed Individual-Based Simulation
602-614
A Self-stabilizing K-Clustering Algorithm Using an Arbitrary Metric
615-626
Active Optimistic Message Logging for Reliable Execution of MPI Applications
629
630-641
A Parallel Numerical Library for UPC
642-653
A Multilevel Parallelization Framework for High-Order Stencil Computations
654-665
Using OpenMP vs. Threading Building Blocks for Medical Imaging on Multi-cores
666-677
Parallel Skeletons for Variable-Length Lists in SkeTo Skeleton Library
678-690
Stkm on Sca: A Unified Framework with Components, Workflows and Algorithmic Skeletons
691-703
Grid-Enabling SPMD Applications through Hierarchical Partitioning and a Component-Based Runtime
704-715
Reducing Rollbacks of Transactional Memory Using Ordered Shared Locks
719-720
721-734
Wavelet-Based Adaptive Solvers on Multi-core Architectures for the Simulation of Complex Systems
735-746
Localized Parallel Algorithm for Bubble Coalescence in Free Surface Lattice-Boltzmann Method
747-759
Fast Implicit Simulation of Oscillatory Flow in Human Abdominal Bifurcation Using a Schur Complement Preconditioner
760-771
A Parallel Rigid Body Dynamics Algorithm
772-784
Optimized Stencil Computation Using In-Place Calculation on Modern Multicore Systems
785-796
Parallel Implementation of Runge–Kutta Integrators with Low Storage Requirements
797-808
PSPIKE: A Parallel Hybrid Sparse Linear System Solver
809-820
Out-of-Core Computation of the QR Factorization on Multi-core Processors
821-833
Adaptive Parallel Householder Bidiagonalization
837-838
839-850
Tile Percolation: An OpenMP Tile Aware Parallelization Technique for the Cyclops-64 Multicore Processor
851-862
An Extension of the StarSs Programming Model for Platforms with Multiple GPUs
863-874
StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures
875-886
XJava: Exploiting Parallelism with Object-Oriented Stream Programming
887-899
JCUDA: A Programmer-Friendly Interface for Accelerating Java Programs with CUDA
900-911
Fast and Efficient Synchronization and Communication Collective Primitives for Dual Cell-Based Blades
912-923
Searching for Concurrent Design Patterns in Video Games
924-935
Parallelization of a Video Segmentation Algorithm on CUDA–Enabled Graphics Processing Units
936-947
A Parallel Point Matching Algorithm for Landmark Based Image Registration Using Multicore Platform
948-959
High Performance Matrix Multiplication on Many Cores
960-973
Parallel Lattice Basis Reduction Using a Multi-threaded Schnorr-Euchner LLL Algorithm
974-985
Efficient Parallel Implementation of Evolutionary Algorithms on GPGPU Cards
989
990-1002
Implementing Parallel Google Map-Reduce in Eden
1003-1010
A Lower Bound for Oblivious Dimensional Routing
1013-1014
1015-1028
A Case Study of Communication Optimizations on 3D Mesh Interconnects
1029-1039
Implementing a Change Assimilation Mechanism for Source Routing Interconnects
1040-1051
Dependability Analysis of a Fault-Tolerant Network Reconfiguring Strategy
1052-1064
RecTOR: A New and Efficient Method for Dynamic Network Reconfiguration
1065-1077
NIC-Assisted Cache-Efficient Receive Stack for Message Passing over Ethernet
1078-1088
A Multipath Fault-Tolerant Routing Method for High-Speed Interconnection Networks
1089-1100
Hardware Implementation Study of the SCFQ-CA and DRR-CA Scheduling Algorithms
1103
1104-1115
Optimal and Near-Optimal Energy-Efficient Broadcasting in Wireless Networks
Back matter