Starred repositories

Reproduce R1 Zero on Logic Puzzle

Python 2,423 165 Updated Mar 20, 2025

Distilling the Essence: Efficient Reasoning Distillation via Sequence Truncation

Python 1 Updated Dec 26, 2025

Step-DeepResearch

Python 250 7 Updated Dec 25, 2025

A Real-Time Fault-tolerant In-Memory Distributed Message Queue

Java 9 2 Updated Jun 25, 2017

LlamBERT implements a hybrid approach for text classification that leverages LLMs to annotate a small subset of large, unlabeled databases and uses the results for fine-tuning transformer …

Python 23 6 Updated Nov 2, 2024

Efficient and optimized tokenizer engine for LLM inference serving

C++ 474 7 Updated Sep 19, 2025

Official Repository for "See, Rank and Filter: Important Word-Aware Clip Filtering via Scene Understanding for Moment Retrieval and Highlight Detection" (AAAI 2026 Oral)

1 Updated Dec 19, 2025

Counter-factual reward ranking

Python 5 Updated Oct 22, 2025

Ambrosia is a Python library for A/B tests design, split and result measurement

Python 239 19 Updated Oct 24, 2023

Effective LLM Alignment Toolkit

Python 151 10 Updated Jun 25, 2025

A lightweight, high-performance microservice for forwarding browser-side logs to server-side log aggregation systems (ELK, Loki, Splunk, etc.).

TypeScript 7 1 Updated Dec 23, 2025

Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.

Python 1 Updated Dec 23, 2025

Source code for the paper "Fast Offline Policy Optimization for Large Scale Recommendation" published at the Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI-23).

Python 3 Updated Jun 27, 2023

Materials for the "Reward Optimising Recommendation using Deep Learning and Fast Maximum Inner Product Search" tutorial delivered at the 28th SIGKDD Conference on Knowledge Discovery and Data Minin…

Jupyter Notebook 6 2 Updated Sep 19, 2022

Source code for the paper "Logarithmic Smoothing for Pessimistic Off-Policy Evaluation, Selection and Learning" published at NeurIPS '24.

Python 7 1 Updated Mar 5, 2025
Python 41 5 Updated Dec 19, 2025

A GPipe implementation in PyTorch

Python 860 98 Updated Jul 25, 2024

Context-Adaptive and Consistency-Aware Multi-Modal Outfit Compatibility Modeling

Python 1 Updated Dec 23, 2025

FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

C++ 1 1 Updated Dec 16, 2025

PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning

Python 243 6 Updated Dec 10, 2025

1688 taobao jd image search products

Python 22 9 Updated Feb 10, 2023

Official implementation for "SocialNav: Training Human-Inspired Foundation Model for Socially-Aware Embodied Navigation"

46 Updated Dec 1, 2025

Official Implementation of paper: [Nav-R2:Dual‑Relation Reasoning for Generalizable Open‑Vocabulary Object‑Goal Navigation]

Python 7 Updated Dec 10, 2025

Guided Proximal Policy Optimization with Structured Action Graph

Python 3 Updated Nov 2, 2023

Lifelong Generative Recommendation Unlearning via Dual-Process Memory and Hierarchical Preference Alignment

Python 1 Updated Dec 9, 2025

Open-source platform to build and deploy AI agent workflows.

TypeScript 24,580 3,064 Updated Dec 28, 2025

DreamPRM tackles the dataset quality imbalance and distribution shift that plague multimodal PRM training by domain-reweighting.

Python 19 3 Updated Sep 6, 2025

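This project aims to provide a complete, closed-loop example of fine-tuning and applying a large model for the hotel recommendation vertical domain, as a reference case. The base model used is Qwen2.5-7B-Instruct. Project highlights: a complete end-to-end vertical application case, openly shared source code with walkthroughs, a detailed illustrated guide, and hands-on video demonstrations of the full process.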

Python 74 20 Updated Apr 23, 2025