
Taro: Task Graph-Based Asynchronous Programming Using C++ Coroutine – Dian-Lun Lin - CppCon 2023

11.9K views
Jan 10, 2024
1:01:43

https://cppcon.org/
---
Taro: Task graph-based Asynchronous Programming Using C++ Coroutine - Dian-Lun Lin - CppCon 2023
https://github.com/CppCon/CppCon2023

Task graph computing systems (TGCSs) play an essential role in high-performance computing. Unlike loop-based models, TGCSs encapsulate function calls and their dependencies in a top-down task graph to implement irregular parallel decomposition strategies that scale to large numbers of processors, including manycore central processing units (CPUs) and graphics processing units (GPUs). As a result, recent years have seen a great deal of TGCS research; to name a few systems: Taskflow, oneTBB, Kokkos-DAG, and HPX. However, one common challenge faced by TGCSs is synchronization within each task. For instance, when a task executes GPU operations, a CPU thread typically needs to wait until the GPU completes those operations before proceeding. This synchronization overhead can hinder performance and limit the overall scalability of TGCSs.

The introduction of coroutines in C++20 has revolutionized asynchronous programming, offering improved concurrency and expressiveness. However, integrating TGCSs with C++ coroutines presents several challenges. Firstly, existing TGCS solutions are not compatible with C++ coroutines, as the coroutine paradigm deviates from traditional C++ programming. This incompatibility makes it difficult to incorporate coroutines into existing TGCS frameworks. Secondly, C++ coroutine programming is extremely difficult and requires a solid understanding of the underlying concepts and mechanisms. The new paradigm adds complexity and a steep learning curve for developers. Lastly, while C++ coroutines offer a powerful mechanism for managing asynchronous operations, designing and implementing an efficient scheduler that leverages their capabilities remains challenging.
To fully exploit the benefits of C++ coroutines, there is a need for a specialized scheduler that can handle large numbers of coroutines and make optimal use of hardware resources. To address these challenges, we present Taro: Task-Graph-Based Asynchronous Programming using C++ Coroutine. Taro offers a task-graph-based programming model for C++ coroutines, simplifying the expression of complex control flows and reducing development complexity. Additionally, Taro incorporates an efficient work-stealing scheduling algorithm tailored for C++ coroutines, minimizing unnecessary context switches, CPU migrations, and cache misses.

In this session, I will introduce Taro's programming model and demonstrate how Taro enables efficient multitasking between CPU and GPU tasks, so that CPU threads do not block while waiting for GPU tasks to finish. I will show example code that uses Taro. Finally, I will demonstrate how our solution improves the performance of a real-world RTL simulation workload and of microbenchmarks. Taro will be open-source and available on GitHub.
---
Dian-Lun Lin
I’m a fourth-year Ph.D. student in the Department of Electrical and Computer Engineering at the University of Wisconsin-Madison. My research interests focus on parallel computing and GPU computing using C++ and CUDA. During the past three years of my Ph.D. studies, I have published three top-tier conference papers (DAC 2023, ICPP 2022, and Euro-Par 2021) and one top-tier journal paper (IEEE TPDS 2022), all as the first author. I received second place in the ACM/PACT Student Research Competition (SRC 2022). I also received the champion award in a research competition (IEEE HPEC Challenge 2020). I am a presenter at CppCon 2023, CppNow 2023, and CppCon 2021. I have also given talks at MediaTek Research, Berkeley National Lab, and NVIDIA Research. My recent work focuses on building a CPU-GPU task programming system using modern C++ coroutines and CUDA.
---
Work at Hudson River Trading (HRT): https://tinyurl.com/safxfctf
---
Videos Filmed & Edited by Bash Films: http://www.BashFilms.com
YouTube Channel Managed by Digital Medium Ltd: https://events.digital-medium.co.uk
---
Registration for CppCon: https://cppcon.org/registration/
#cppcon #cppprogramming #cpp

