- Shanghai, China
- http://boyuandeng.me
- @boyuandeng
Stars
Code repo for the paper "LLM-QAT: Data-Free Quantization Aware Training for Large Language Models"
A 13B large language model developed by Baichuan Intelligent Technology
Official github repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023]
Improved build system generator for CPython C, C++, Cython and Fortran extensions
A large-scale 7B pretrained language model developed by BaiChuan-Inc.
OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
A high-throughput and memory-efficient inference and serving engine for LLMs (see the usage sketch after this list)
Unsupervised text tokenizer for Neural Network-based text generation (see the usage sketch after this list).
Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI-compatible API endpoints in the cloud.
[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.
The blessed package to manage your versions from SCM tags
⚡ Langchain apps in production using Jina & FastAPI
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
Explore and understand your training and validation data.
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
PromptCBLUE: a large-scale instruction-tuning dataset for multi-task and few-shot learning in the medical domain in Chinese
Repository accompanying "An Open Dataset and Model for Language Identification" (Burchell et al., 2023)
HuatuoGPT, Towards Taming Language Models To Be a Doctor (an open medical GPT)
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
🩺 The first Chinese multimodal medical large model that can read chest X-rays (chest radiograph summarization)
A next-generation Python CMake adaptor and Python API for plugins
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory…
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
Class notes for the course "Long Term Memory in AI - Vector Search and Databases" COS 597A @ Princeton Fall 2023
Chinese and English multimodal conversational language model
Concurrently chat with ChatGPT, Bing Chat, Bard, Alpaca, Vicuna, Claude, ChatGLM, MOSS, iFLYTEK Spark (讯飞星火), ERNIE Bot (文心一言) and more, to discover the best answers
A guidance language for controlling large language models.
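
A minimal sketch of the vLLM serving engine entry above, assuming the `vllm` Python package is installed; the model id, prompt, and sampling settings are illustrative placeholders, not anything prescribed by this list:

```python
# Offline batched inference with vLLM (a sketch; the model id is an example).
from vllm import LLM, SamplingParams

prompts = ["Summarize quantization-aware training in one sentence."]
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

llm = LLM(model="openlm-research/open_llama_7b")  # any HF-compatible model id
outputs = llm.generate(prompts, params)

for out in outputs:
    # Each result carries the original prompt and one or more completions.
    print(out.prompt, "->", out.outputs[0].text)
```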
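And a minimal sketch of the SentencePiece tokenizer entry above, assuming a plain-text corpus at a hypothetical path `corpus.txt`; the vocabulary size and model prefix are placeholders:

```python
# Train and apply an unsupervised subword tokenizer with SentencePiece.
import sentencepiece as spm

# Learn a subword model directly from raw text (no pre-tokenization needed).
spm.SentencePieceTrainer.train(
    input="corpus.txt", model_prefix="example_sp", vocab_size=8000
)

sp = spm.SentencePieceProcessor(model_file="example_sp.model")
print(sp.encode("Hello world", out_type=str))  # subword pieces
print(sp.encode("Hello world"))                # integer token ids
print(sp.decode(sp.encode("Hello world")))     # round-trip back to text
```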