- Shanghai
- http://boxfishlab.com
Stars
LW-BenchHub is a unified benchmark hub built on Isaac Lab–Arena for embodied AI, providing consistent interfaces, realistic environments, multi-robot support, and large-scale evaluation. It include…
EverMemOS is an open-source, enterprise-grade intelligent memory system. Our mission is to build AI memory that never forgets, making every conversation built on previous understanding.
Memory infrastructure for LLMs and AI agents
Youtu-GraphRAG boosts cost efficiency, inference accuracy, and cross-domain adaptability, pushing the boundaries of performance in complex QA.
"Context engineering is the delicate art and science of filling the context window with just the right information for the next step." — Andrej Karpathy. A frontier, first-principles handbook inspi…
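《开源大模型食用指南》 (A Guide to Using Open-Source Large Models): a tutorial tailored for Chinese users on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large language models (MLLMs) in a Linux environment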
Video translation and dubbing tool powered by LLMs. The video translator offers 100 language translations and one-click full-process deployment. The video translation output is optimized for platfo…
Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Unleashing the Power of Reinforcement Learning for Math and Code Reasoners
[SIGGRAPH 2025] LAM: Large Avatar Model for One-shot Animatable Gaussian Head
🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
Easily fine-tune, evaluate and deploy gpt-oss, Qwen3, DeepSeek-R1, or any open source LLM / VLM!
A powerful framework for building realtime voice AI agents 🤖🎙️📹
[SIGGRAPH'24] CharacterGen: Efficient 3D Character Generation from Single Images with Multi-View Pose Canonicalization
Solve Visual Understanding with Reinforced VLMs
[ICLR 2025] - Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion
Production-ready platform for agentic workflow development.
Lumina-Image 2.0: A Unified and Efficient Image Generative Framework
💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone