Tuowen Zhao
SambaNova Systems
Verified email at sambanovasystems.com
Title
Cited by
Year
Exploiting reuse and vectorization in blocked stencil computations on CPUs and GPUs
T Zhao, P Basu, S Williams, M Hall, H Johansen
Proceedings of the International Conference for High Performance Computing …, 2019
Cited by 33, 2019
Delivering Performance-Portable Stencil Computations on CPUs and GPUs Using Bricks
T Zhao, S Williams, M Hall, H Johansen
2018 IEEE/ACM International Workshop on Performance, Portability and …, 2018
Cited by 32, 2018
Improving communication by optimizing on-node data movement with data layout
T Zhao, M Hall, H Johansen, S Williams
Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of …, 2021
Cited by 11, 2021
Polyhedral Specification and Code Generation of Sparse Tensor Contraction with Co-Iteration
T Zhao, T Popoola, M Hall, C Olschanowsky, MM Strout
ACM Transactions on Architecture and Code Optimization 20 (1), 1-26, 2022
Cited by 10, 2022
SIMD code generation for stencils on brick decompositions
T Zhao, M Hall, P Basu, S Williams, H Johansen
Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of …, 2018
Cited by 4, 2018
Code Synthesis for Sparse Tensor Format Conversion and Optimization
T Popoola, T Zhao, A St. George, K Bhetwal, MM Strout, M Hall, ...
Proceedings of the 21st ACM/IEEE International Symposium on Code Generation …, 2023
Cited by 3, 2023
Performance portability evaluation of blocked stencil computations on GPUs
O Antepara, S Williams, H Johansen, T Zhao, S Hirsch, P Goyal, M Hall
Proceedings of the SC'23 Workshops of The International Conference on High …, 2023
2023
Maximizing Performance Through Memory Hierarchy-Driven Data Layout Transformations
B Sepanski, T Zhao, H Johansen, S Williams
2022 IEEE/ACM Workshop on Memory Centric High Performance Computing (MCHPC …, 2022
2022
Optimizing Data Movement and Achieving Performance Portability with Fine-Grained Data Blocking
T Zhao
The University of Utah, 2022
2022
A Novel Variable-Blocking Representation for Efficient Sparse Matrix-Vector Multiply on GPUs
T Zhao, T Rusira, K Ahmad, M Hall
2016
Chapel With Polyhedral Transformation Using Autotuning
T Zhao, M Hall
2016