Keren Zhou
Title
Cited by
Year
Understanding the GPU microarchitecture to achieve bare-metal performance tuning
X Zhang, G Tan, S Xue, J Li, K Zhou, M Chen
Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of …, 2017
Cited by: 67, Year: 2017
Tools for top-down performance analysis of GPU-accelerated applications
K Zhou, M Krentel, J Mellor-Crummey
Proceedings of the 34th ACM International Conference on Supercomputing 26, 1–12, 2020
Cited by: 27, Year: 2020
A performance analysis framework for exploiting GPU microarchitectural capability
K Zhou, G Tan, X Zhang, C Wang, N Sun
Proceedings of the International Conference on Supercomputing, 1-10, 2017
Cited by: 21, Year: 2017
Multi-classes feature engineering with sliding window for purchase prediction in mobile commerce
Q Li, M Gu, K Zhou, X Sun
2015 IEEE International Conference on Data Mining Workshop (ICDMW), 1048-1054, 2015
Cited by: 20, Year: 2015
GVProf: A value profiler for GPU-based clusters
K Zhou, Y Hao, J Mellor-Crummey, X Meng, X Liu
SC20: International Conference for High Performance Computing, Networking …, 2020
Cited by: 18, Year: 2020
GPA: A GPU Performance Advisor Based on Instruction Sampling
K Zhou, X Meng, R Sai, J Mellor-Crummey
2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2021
Cited by: 17, Year: 2021
Accelerating high‐order stencils on GPUs
R Sai, J Mellor-Crummey, X Meng, K Zhou, M Araya-Polo, J Meng
Concurrency and Computation: Practice and Experience 34 (20), 2021
Cited by: 16, Year: 2021
Measurement and analysis of GPU-accelerated applications with HPCToolkit
K Zhou, L Adhianto, J Anderson, A Cherian, D Grubisic, M Krentel, Y Liu, ...
Parallel Computing 108, 102837, 2021
Cited by: 12, Year: 2021
An automated tool for analysis and tuning of GPU-accelerated code in HPC applications
K Zhou, X Meng, R Sai, D Grubisic, J Mellor-Crummey
IEEE Transactions on Parallel and Distributed Systems 33 (4), 854-865, 2021
Cited by: 11, Year: 2021
Outcomes of OpenMP hackathon: OpenMP application experiences with the offloading model (part II)
B Chapman, B Pham, C Yang, C Daley, C Bertoni, D Kulkarni, ...
OpenMP: Enabling Massive Node-Level Parallelism: 17th International Workshop …, 2021
Cited by: 11, Year: 2021
ValueExpert: Exploring value patterns in GPU-Accelerated applications
K Zhou, Y Hao, J Mellor-Crummey, X Meng, X Liu
Proceedings of the 27th ACM International Conference on Architectural …, 2022
Cited by: 9, Year: 2022
Outcomes of OpenMP Hackathon: OpenMP Application Experiences with the Offloading Model
S Pophale, D Oryspayev, B Chapman, B Pham, C Yang, C Daley, ...
Brookhaven National Lab.(BNL), Upton, NY (United States), 2021
Cited by: 6, Year: 2021
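Research on a two-layer index architecture for cloud data processing based on concurrent skip lists
W Zhou, J Lu, K Zhou, S Wang, S Yao
Journal of Computer Research and Development (计算机研究与发展) 52 (7), 1531-1545, 2015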
Cited by: 6, Year: 2015
Measurement and Analysis of GPU-Accelerated OpenCL Computations on Intel GPUs
AT Cherian, K Zhou, D Grubisic, X Meng, J Mellor-Crummey
2021 IEEE/ACM International Workshop on Programming and Performance …, 2021
Cited by: 4, Year: 2021
Quadboost: A scalable concurrent quadtree
K Zhou, G Tan, W Zhou
IEEE Transactions on Parallel and Distributed Systems 29 (3), 673-686, 2017
Cited by: 4, Year: 2017
PyTorch 2: Faster Machine Learning Through Dynamic Python Bytecode Transformation and Graph Compilation
J Ansel, E Yang, H He, N Gimelshein, A Jain, M Voznesensky, B Bao, ...
Cited by: 3, Year: 2024
Low overhead and context sensitive profiling of GPU-accelerated applications
K Zhou, J Anderson, X Meng, J Mellor-Crummey
Proceedings of the 36th ACM International Conference on Supercomputing, 1-13, 2022
Cited by: 3, Year: 2022
Semi-supervised learning for shale image segmentation with fast normalized cut loss
B Yin, Q Hu, Y Zhu, K Zhou
Geoenergy Science and Engineering 229, 212039, 2023
Cited by: 2, Year: 2023
Hardware-aware compression with random operation access specific tile (ROAST) hashing
A Desai, K Zhou, A Shrivastava
International Conference on Machine Learning, 7732-7749, 2023
Cited by: 2, Year: 2023
DrGPUM: Guiding Memory Optimization for GPU-Accelerated Applications
M Lin, K Zhou, P Su
Proceedings of the 28th ACM International Conference on Architectural …, 2023
Cited by: 2, Year: 2023