Sorosliu1029/Papers

Papers

| Name | Finish Date |
| --- | --- |
| Writing reviews for systems conferences | TODO |
| Efficiently compiling efficient query plans for modern hardware | 2025-12-08 |
| Encapsulation of parallelism in the Volcano query processing system | 2025-12-08 |
| Parallel Database Systems: The Future of High Performance Database Processing | 2025-12-08 |
| The Case for Learned Index Structures | 2025-12-08 |
| C-store: a column-oriented DBMS | 2025-12-08 |
| Vectorwise: Beyond Column Stores | 2025-12-07 |
| R-trees: a dynamic index structure for spatial searching | 2025-12-07 |
| The Bw-Tree: A B-tree for new hardware platforms | 2025-12-04 |
| The Snowflake Elastic Data Warehouse | 2025-11-19 |
| A comparison of approaches to large-scale data analysis | 2025-11-17 |
| SuperFedNAS: Cost-Efficient Federated Neural Architecture Search for On-Device Inference | 2025-11-12 |
| Flame: Simplifying Topology Extension in Federated Learning | 2025-11-11 |
| Bigtable: A Distributed Storage System for Structured Data | 2025-11-11 |
| BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models | 2025-11-09 |
| DεpS: Delayed ε-Shrinking for Faster Once-For-All Training | 2025-11-09 |
| Debunking the CUDA Myth Towards GPU-based AI Systems: Evaluation of the Performance and Programmability of Intel's Gaudi NPU for AI Model Serving | 2025-11-07 |
| SqueezeLLM: Dense-and-Sparse Quantization | 2025-11-05 |
| The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks | 2025-11-04 |
| Fast Inference from Transformers via Speculative Decoding | 2025-10-28 |
| FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness | 2025-10-28 |
| Cartridges: Lightweight and general-purpose long context representations via self-study | 2025-10-26 |
| UGPU: Dynamically Constructing Unbalanced GPUs for Enhanced Resource Efficiency | 2025-10-25 |
| Medha: Efficiently Serving Multi-Million Context Length LLM Inference Requests Without Approximations | 2025-10-24 |
| A Berkeley View of Systems Challenges for AI | 2025-10-14 |
| Hidden Technical Debt in Machine Learning Systems | 2025-10-14 |
| DeepSeek-V3 Technical Report | 2025-10-12 |
| DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving | 2025-10-06 |
| Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve | 2025-10-05 |
| Mosaic: A GPU Memory Manager with Application-Transparent Support for Multiple Page Sizes | 2025-10-04 |
| Orca: A Distributed Serving System for Transformer-Based Generative Models | 2025-10-03 |
| Loki: A System for Serving ML Inference Pipelines with Hardware and Accuracy Scaling | 2025-10-03 |
| Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads | 2025-09-29 |
| TetriSched: global rescheduling with adaptive plan-ahead in dynamic heterogeneous clusters | 2025-09-28 |
| MLaaS in the Wild: Workload Analysis and Scheduling in Large-Scale Heterogeneous GPU Clusters | 2025-09-28 |
| An Empirical Evaluation of Columnar Storage Formats | 2025-09-27 |
| A variable warp size architecture | 2025-09-27 |
| Gandiva: Introspective Cluster Scheduling for Deep Learning | 2025-09-23 |
| SuperServe: Fine-Grained Inference Serving for Unpredictable Workloads | 2025-09-21 |
| INFaaS: Automated Model-less Inference Serving | 2025-09-21 |
| Scalable GPU graph traversal | 2025-09-21 |
| InferLine: ML Prediction Pipeline Provisioning and Management for Tight Latency Objectives | 2025-09-16 |
| Clipper: A Low-Latency Online Prediction Serving System | 2025-09-16 |
| Varuna: Scalable, Low-cost Training of Massive Deep Learning Models | 2025-09-14 |
| Accelerating Large Graph Algorithms on the GPU Using CUDA | 2025-09-14 |
| Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning | 2025-09-14 |
| ModServe: Scalable and Resource-Efficient Large Multimodal Model Serving | 2025-09-12 |
| Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM | 2025-09-09 |
| ZeRO: Memory Optimizations Toward Training Trillion Parameter Models | 2025-09-09 |
| PyTorch Distributed: Experiences on Accelerating Data Parallel Training | 2025-09-07 |
| Scaling Laws for Neural Language Models | 2025-09-07 |
| Optimization Techniques for GPU Programming | 2025-09-06 |
| PyTorch 2: Faster Machine Learning Through Dynamic Python Bytecode Transformation and Graph Compilation | 2025-09-01 |
| Triton: an intermediate language and compiler for tiled neural network computations | 2025-08-31 |
| How to Read a Computer Science Research Paper | 2025-08-30 |
| How to Read a Paper | 2025-08-30 |
| Analyzing Modern NVIDIA GPU cores | 2025-08-29 |
| PyTorch: An Imperative Style, High-Performance Deep Learning Library | 2025-08-25 |
| TensorFlow: A system for large-scale machine learning | 2025-08-24 |
| A Few Useful Things to Know About Machine Learning | 2025-08-22 |
| What Goes Around Comes Around… And Around… | 2025-08-20 |
| Kafka: a Distributed Messaging System for Log Processing | 2023-02-26 |
| Blockstack: A Global Naming and Storage System Secured by Blockchains | 2022-08-13 |
| Bitcoin: A Peer-to-Peer Electronic Cash System | 2022-08-10 |
| Secure Untrusted Data Repository (SUNDR) | 2022-08-05 |
| Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS | 2022-08-04 |
| Scaling Memcache at Facebook | 2022-07-17 |
| Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing | 2022-07-16 |
| No compromises: distributed transactions with consistency, availability, and performance | 2022-07-14 |
| Spanner: Google’s Globally-Distributed Database | 2022-07-11 |
| Frangipani: A Scalable Distributed File System | 2022-07-10 |
| Chain Replication for Supporting High Throughput and Availability | 2022-06-29 |
| ZooKeeper: Wait-free coordination for Internet-scale systems | 2022-06-27 |
| In Search of an Understandable Consensus Algorithm (Extended Version) | 2022-06-19 |
| The Go Programming Language and Environment | 2022-06-06 |
| The Design of a Practical System for Fault-Tolerant Virtual Machines | 2022-06-05 |
| The Google File System | 2022-06-03 |
| MapReduce: Simplified Data Processing on Large Clusters | 2022-05-28 |
| The Evolution of the Unix Time-sharing System | 2022-05-25 |
| The UNIX Time-Sharing System | 2022-05-24 |
| RCU Usage In the Linux Kernel: One Decade Later | 2022-05-12 |
| Meltdown: Reading Kernel Memory from User Space | 2022-05-11 |
| Eliminating Receive Livelock in an Interrupt-driven Kernel | 2022-05-08 |
| The benefits and costs of writing a POSIX kernel in a high-level language | 2022-05-07 |
| Dune: Safe User-level Access to Privileged CPU Features | 2022-05-03 |
| The Performance of micro-Kernel-Based Systems | 2022-05-02 |
| Virtual Memory Primitives for User Programs | 2022-04-30 |
| Journaling the Linux ext2fs Filesystem | 2022-04-24 |