Stars
Official codebase for "Quantile Reward Policy Optimization: Alignment with Pointwise Regression and Exact Partition Functions" (Matrenok et al. 2025).
Codebase to fully reproduce the results of "No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO" (Moalla et al. 2024). Uses TorchRL and provides extensive tool…
A template for starting reproducible Python machine-learning projects with hardware acceleration. Find an example at https://github.com/CLAIRE-Labo/no-representation-no-trust
Анализ переноса обучения в задаче определения тональности
Clean minimalist implementations of popular competitive programming algorithms