Starred repositories

Codebase for reproducing the experiments of the semantic uncertainty paper (short-phrase and sentence-length experiments).

Python 387 50 Updated Apr 12, 2024

Dermatology ddx dataset, Jax implementations of Monte Carlo conformal prediction, plausibility regions and statistical annotation aggregation from our recent work on uncertain ground truth (TMLR'23…

Python 674 50 Updated Mar 28, 2024

Code for STaR: Bootstrapping Reasoning With Reasoning (NeurIPS 2022)

Python 218 23 Updated Feb 21, 2023

Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models)

Python 116 11 Updated Sep 13, 2024

This repository contains the paper list for the paper: Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents

367 33 Updated Nov 25, 2023

CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)

Python 8,679 670 Updated Aug 13, 2024

GLM-4 series: Open Multilingual Multimodal Chat LMs

Python 6,928 596 Updated Jul 4, 2025

SWE-bench: Can Language Models Resolve Real-world Github Issues?

Python 3,796 688 Updated Oct 11, 2025

Chinese LLaMA & Alpaca large language models, plus local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)

Python 18,945 1,877 Updated Jul 15, 2025

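Repository of Chinese post-trained versions of Llama3 and Llama3.1: interesting fine-tuned and modified weights, plus tutorial videos and documentation for training, inference, evaluation, and deployment.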

Python 4,164 338 Updated May 7, 2025

[ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Your efficient and high-quality synthetic data generation pipeline!

Python 788 69 Updated Mar 17, 2025

Finetune Llama-3-8b on the MathInstruct dataset

Python 113 26 Updated Oct 17, 2024

A family of compressed models obtained via pruning and knowledge distillation

355 18 Updated Nov 6, 2025
Jupyter Notebook 467 34 Updated Jul 22, 2024

[ACL 2024] The project of Symbol-LLM

Python 59 4 Updated Jul 10, 2024

PaL: Program-Aided Language Models (ICML 2023)

Python 515 64 Updated Jun 30, 2023

The Mix of Minimal Optimal Sets (MMOS) dataset has two advantages for math reasoning: higher performance and lower construction costs.

Python 74 3 Updated Jul 27, 2024

ToRA is a series of Tool-integrated Reasoning LLM Agents designed to solve challenging mathematical reasoning problems by interacting with tools [ICLR'24].

Python 1,105 78 Updated Feb 22, 2024

Train transformer language models with reinforcement learning.

Python 16,292 2,291 Updated Nov 14, 2025

[TMLR] A curated list of language modeling research for code (and other software engineering activities), plus related datasets.

3,056 203 Updated Nov 10, 2025

Data processing for code LLMs covering pretraining, fine-tuning, and DPO: a state-of-the-art industry processing pipeline.

Python 44 10 Updated Jul 25, 2024

Lightweight and portable LLM sandbox runtime (code interpreter) Python library.

Python 616 56 Updated Nov 14, 2025

Parse LaTeX math expressions

Python 140 31 Updated Aug 5, 2024

[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)

Python 1,460 180 Updated Nov 9, 2025

(ICML 2024) Alphazero-like Tree-Search can guide large language model decoding and training

Python 284 29 Updated May 26, 2024

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 10,092 733 Updated Nov 13, 2025

Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"

Python 386 16 Updated Jan 19, 2025

Claude Engineer is an interactive command-line interface (CLI) that leverages the power of Anthropic's Claude-3.5-Sonnet model to assist with software development tasks. This framework enables Claud…

Python 11,137 1,164 Updated Dec 12, 2024