USENIX ATC '23 - VectorVisor: A Binary Translation Scheme for Throughput-Oriented GPU Acceleration
Samuel Ginzburg, Princeton University; Mohammad Shahrad, University of British Columbia; Michael J. Freedman, Princeton University

Beyond conventional graphics applications, general-purpose GPU acceleration has had significant impact on machine learning and scientific computing workloads. Yet, it has failed to see widespread use for server-side applications, which we argue is because GPU programming models offer a level of abstraction that is either too low-level (e.g., OpenCL, CUDA) or too high-level (e.g., TensorFlow, Halide), depending on the language. Not all applications fit into either category, resulting in lost opportunities for GPU acceleration. We introduce VectorVisor, a vectorized binary translator that enables new opportunities for GPU acceleration by introducing a novel programming model for GPUs. With VectorVisor, many copies of the same server-side application are run concurrently on the GPU, where VectorVisor mimics the abstractions provided by CPU threads. To achieve this goal, we demonstrate how to (i) provide cross-platform support for system calls and recursion using continuations and (ii) make full use of the excess register file capacity and high memory bandwidth of GPUs. We then demonstrate that our binary translator is able to transparently accelerate certain classes of compute-bound workloads, gaining significant improvements in throughput-per-dollar of up to 2.9× compared to Intel x86-64 VMs in the cloud, and in some cases match the throughput-per-dollar of native CUDA baselines.

View the full USENIX ATC '23 program at https://www.usenix.org/conference/atc23/technical-sessions