POSTECH
- 77 Cheongam-Ro. Nam-Gu. Pohang. Gyeongbuk. Korea 37673
- https://postechcsreference.notion.site/POSTECH-CSE-Course-Wiki-20aa3170f7f04e95b242e0aabb2a445b?pvs=74
- https://www.resume.lol/share/vi03idj
- in/kim-jaejin
Stars
A library of long-horizon Task-and-Motion-Planning (TAMP) problems in kitchen and household scenes, as well as planners to solve them
StelLA: Subspace Learning in Low-rank Adaptation using Stiefel Manifold (NeurIPS 2025 Spotlight)
[NeurIPS 2025] Official Implementation of ViSpec: Accelerating Vision-Language Models with Vision-Aware Speculative Decoding.
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
Fast and exact implementation of the C++ from_chars functions for number types: 4x to 10x faster than strtod, part of GCC 12, MySQL, Chromium, Redis and WebKit/Safari
Official code of "StreamBP: Memory-Efficient Exact Backpropagation for Long Sequence Training of LLMs".
A Collection of Papers on Diffusion Language Models
Continuous Thought Machines, because thought takes time and reasoning is a process.
Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels
Code for ICLR 2025 paper "Emergence of a High-Dimensional Abstraction Phase in Language Transformers"
Training Large Language Model to Reason in a Continuous Latent Space
Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)
[SIGMOD '25] A fast parallel kd-tree implementation
Cost-efficient and pluggable Infrastructure components for GenAI inference
My learning notes/codes for ML SYS.
Data Structure Algorithms, (GenAI/ML) System Design, Machine Learning, DevOps coding interview practices
The nnsight package enables interpreting and manipulating the internals of deep learned models.
Backend.AI is a streamlined, container-based computing cluster platform that hosts popular computing/ML frameworks and diverse programming languages, with pluggable heterogeneous accelerator suppor…
Official implementation of "TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models"
The official implementation for [NeurIPS2025 Oral] Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free
A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.