sijial430
🎯 Focusing

Pinned

  1. HALOs (Public)

    Forked from ContextualAI/HALOs

    A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).

    Python

  2. ContextualAI/HALOs (Public)

    A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).

    Python · 892 stars · 49 forks

  3. trl-fork (Public)

    Forked from Muennighoff/trl-fork

    Train transformer language models with reinforcement learning.

    Python

  4. verl-fork (Public)

    Forked from volcengine/verl

    verl: Volcano Engine Reinforcement Learning for LLMs

    Python