A view of cloud computing M Armbrust, A Fox, R Griffith, AD Joseph, R Katz, A Konwinski, G Lee, ... Communications of the ACM 53 (4), 50-58, 2010 | 13988 | 2010 |
Spark: Cluster computing with working sets M Zaharia, M Chowdhury, MJ Franklin, S Shenker, I Stoica 2nd USENIX Workshop on Hot Topics in Cloud Computing (HotCloud 10), 2010 | 12065* | 2010 |
Above the clouds: A Berkeley view of cloud computing M Armbrust, A Fox, R Griffith, AD Joseph, RH Katz, A Konwinski, G Lee, ... Technical Report UCB/EECS-2009-28, EECS Department, University of California …, 2009 | 8879 | 2009 |
Apache Spark: a unified engine for big data processing M Zaharia, RS Xin, P Wendell, T Das, M Armbrust, A Dave, X Meng, ... Communications of the ACM 59 (11), 56-65, 2016 | 2943 | 2016 |
On the opportunities and risks of foundation models R Bommasani, DA Hudson, E Adeli, R Altman, S Arora, S von Arx, ... arXiv preprint arXiv:2108.07258, 2021 | 2762 | 2021 |
Mesos: A platform for fine-grained resource sharing in the data center B Hindman, A Konwinski, M Zaharia, A Ghodsi, AD Joseph, R Katz, ... 8th USENIX Symposium on Networked Systems Design and Implementation (NSDI 11), 2011 | 2563 | 2011 |
Improving MapReduce performance in heterogeneous environments. M Zaharia, A Konwinski, AD Joseph, RH Katz, I Stoica OSDI 8 (4), 7, 2008 | 2441 | 2008 |
MLlib: Machine learning in Apache Spark X Meng, J Bradley, B Yavuz, E Sparks, S Venkataraman, D Liu, ... Journal of Machine Learning Research 17 (34), 1-7, 2016 | 2364 | 2016 |
Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling M Zaharia, D Borthakur, J Sen Sarma, K Elmeleegy, S Shenker, I Stoica Proceedings of the 5th European conference on Computer systems, 265-278, 2010 | 2006 | 2010 |
Spark SQL: Relational data processing in Spark M Armbrust, RS Xin, C Lian, Y Huai, D Liu, JK Bradley, X Meng, T Kaftan, ... Proceedings of the 2015 ACM SIGMOD international conference on management of …, 2015 | 1833 | 2015 |
Dominant resource fairness: Fair allocation of multiple resource types A Ghodsi, M Zaharia, B Hindman, A Konwinski, S Shenker, I Stoica 8th USENIX symposium on networked systems design and implementation (NSDI 11), 2011 | 1640 | 2011 |
Discretized streams: Fault-tolerant streaming computation at scale M Zaharia, T Das, H Li, T Hunter, S Shenker, I Stoica Proceedings of the twenty-fourth ACM symposium on operating systems …, 2013 | 1430 | 2013 |
ColBERT: Efficient and effective passage search via contextualized late interaction over BERT O Khattab, M Zaharia Proceedings of the 43rd International ACM SIGIR conference on research and …, 2020 | 982 | 2020 |
Managing data transfers in computer clusters with Orchestra M Chowdhury, M Zaharia, J Ma, MI Jordan, I Stoica SIGCOMM 41 (4), 2011 | 811 | 2011 |
Sparrow: distributed, low latency scheduling K Ousterhout, P Wendell, M Zaharia, I Stoica Proceedings of the twenty-fourth ACM symposium on operating systems …, 2013 | 797 | 2013 |
Discretized streams: an efficient and fault-tolerant model for stream processing on large clusters M Zaharia, T Das, H Li, S Shenker, I Stoica Proceedings of the 4th USENIX conference on Hot Topics in Cloud Computing, 10-10, 2012 | 789 | 2012 |
PipeDream: generalized pipeline parallelism for DNN training D Narayanan, A Harlap, A Phanishayee, V Seshadri, NR Devanur, ... Proceedings of the 27th ACM symposium on operating systems principles, 1-15, 2019 | 705 | 2019 |
Learning Spark: lightning-fast big data analysis H Karau, A Konwinski, P Wendell, M Zaharia O'Reilly Media, Inc., 2015 | 694 | 2015 |
Shark: SQL and rich analytics at scale RS Xin, J Rosen, M Zaharia, MJ Franklin, S Shenker, I Stoica Proceedings of the 2013 ACM SIGMOD International Conference on Management of …, 2013 | 650 | 2013 |
NoScope: optimizing neural network queries over video at scale D Kang, J Emmons, F Abuzaid, P Bailis, M Zaharia arXiv preprint arXiv:1703.02529, 2017 | 499* | 2017 |
Job scheduling for multi-user MapReduce clusters M Zaharia, D Borthakur, JS Sarma, K Elmeleegy, S Shenker, I Stoica EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS …, 2009 | 495 | 2009 |
A cloud-compatible bioinformatics pipeline for ultrarapid pathogen identification from next-generation sequencing of clinical samples SN Naccache, S Federman, N Veeraraghavan, M Zaharia, D Lee, ... Genome research 24 (7), 1180-1192, 2014 | 486 | 2014 |
Tachyon: Reliable, memory speed storage for cluster computing frameworks H Li, A Ghodsi, M Zaharia, S Shenker, I Stoica Proceedings of the ACM Symposium on Cloud Computing, 1-15, 2014 | 476 | 2014 |
Beyond Data and Model Parallelism for Deep Neural Networks. Z Jia, M Zaharia, A Aiken Proceedings of Machine Learning and Systems 1, 1-13, 2019 | 467 | 2019 |
Efficient large-scale language model training on GPU clusters using Megatron-LM D Narayanan, M Shoeybi, J Casper, P LeGresley, M Patwary, ... Proceedings of the International Conference for High Performance Computing …, 2021 | 414 | 2021 |
Accelerating the machine learning lifecycle with MLflow. M Zaharia, A Chen, A Davidson, A Ghodsi, SA Hong, A Konwinski, ... IEEE Data Eng. Bull. 41 (4), 39-45, 2018 | 392 | 2018 |
DAWNBench: An end-to-end deep learning benchmark and competition C Coleman, D Narayanan, D Kang, T Zhao, J Zhang, L Nardi, P Bailis, ... Training 100 (101), 102, 2017 | 370 | 2017 |
Vuvuzela: Scalable private messaging resistant to traffic analysis J Van Den Hooff, D Lazar, M Zaharia, N Zeldovich Proceedings of the 25th Symposium on Operating Systems Principles, 137-152, 2015 | 345 | 2015 |
Low-cost communication for rural internet kiosks using mechanical backhaul A Seth, D Kroeker, M Zaharia, S Guo, S Keshav Proceedings of the 12th annual international conference on Mobile computing …, 2006 | 316 | 2006 |
MLPerf training benchmark P Mattson, C Cheng, G Diamos, C Coleman, P Micikevicius, D Patterson, ... Proceedings of Machine Learning and Systems 2, 336-349, 2020 | 306 | 2020 |
Faster and more accurate sequence alignment with SNAP M Zaharia, WJ Bolosky, K Curtis, A Fox, D Patterson, S Shenker, I Stoica, ... arXiv preprint arXiv:1111.5572, 2011 | 283 | 2011 |
Fast and interactive analytics over Hadoop data with Spark M Zaharia, M Chowdhury, T Das, A Dave, J Ma, M McCauley, M Franklin, ... USENIX ;login: 37 (4), 45-51, 2012 | 272 | 2012 |
Multi-resource fair queueing for packet processing A Ghodsi, V Sekar, M Zaharia, I Stoica Proceedings of the ACM SIGCOMM 2012 conference on Applications, technologies …, 2012 | 269 | 2012 |
ModelDB: a system for machine learning model management M Vartak, H Subramanyam, WE Lee, S Viswanathan, S Husnoo, ... Proceedings of the Workshop on Human-In-the-Loop Data Analytics, 1-3, 2016 | 255 | 2016 |
ColBERTv2: Effective and efficient retrieval via lightweight late interaction K Santhanam, O Khattab, J Saad-Falcon, C Potts, M Zaharia arXiv preprint arXiv:2112.01488, 2021 | 253 | 2021 |
Choosy: Max-min fair sharing for datacenter jobs with constraints A Ghodsi, M Zaharia, S Shenker, I Stoica Proceedings of the 8th ACM European Conference on Computer Systems, 365-378, 2013 | 252 | 2013 |
Selection via proxy: Efficient data selection for deep learning C Coleman, C Yeh, S Mussmann, B Mirzasoleiman, P Bailis, P Liang, ... arXiv preprint arXiv:1906.11829, 2019 | 251 | 2019 |
TASO: optimizing deep learning computation with automatic generation of graph substitutions Z Jia, O Padon, J Thomas, T Warszawski, M Zaharia, A Aiken Proceedings of the 27th ACM Symposium on Operating Systems Principles, 47-62, 2019 | 247 | 2019 |
How is ChatGPT's behavior changing over time? L Chen, M Zaharia, J Zou arXiv preprint arXiv:2307.09009, 2023 | 234 | 2023 |
Structured Streaming: A declarative API for real-time applications in Apache Spark M Armbrust, T Das, J Torres, B Yavuz, S Zhu, R Xin, A Ghodsi, I Stoica, ... Proceedings of the 2018 International Conference on Management of Data, 601-613, 2018 | 229 | 2018 |