Is there an existing issue for this?
Current Behavior

Expected Behavior
No response
Steps To Reproduce
Load the fine-tuned multi-turn dialogue checkpoint with AutoTokenizer and AutoModel, then run prediction with stream_chat.
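
For anyone trying to reproduce this, the step above might look roughly like the sketch below. It is not the reporter's actual script. The helper names `load_checkpoint` and `run_multi_turn` are hypothetical, the checkpoint directory is whatever path your fine-tuned checkpoint lives at, and `trust_remote_code=True` is an assumption based on `stream_chat` being defined in the checkpoint's own modeling code rather than in Transformers itself. The sketch also assumes `stream_chat` yields `(partial_response, updated_history)` pairs.

```python
from typing import Iterable, List


def load_checkpoint(checkpoint_dir: str):
    """Load the tokenizer and model from a fine-tuned checkpoint directory."""
    # Imported lazily so run_multi_turn can be used without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    # Assumption: the checkpoint ships custom modeling code (where stream_chat
    # lives), so trust_remote_code=True is needed to load it.
    tokenizer = AutoTokenizer.from_pretrained(checkpoint_dir, trust_remote_code=True)
    model = AutoModel.from_pretrained(checkpoint_dir, trust_remote_code=True)
    return tokenizer, model.eval()


def run_multi_turn(model, tokenizer, queries: Iterable[str]) -> List[str]:
    """Send each query in turn, carrying the dialogue history forward."""
    history = []
    responses = []
    for query in queries:
        response = ""
        # Assumption: stream_chat yields (partial_response, updated_history);
        # the last yielded pair holds the complete reply for this turn.
        for response, history in model.stream_chat(tokenizer, query, history=history):
            pass
        responses.append(response)
    return responses
```

Calling `tokenizer, model = load_checkpoint(path)` followed by `run_multi_turn(model, tokenizer, [...])` with two or more queries exercises the multi-turn path. Moving the model to GPU or half precision is left out here and depends on the environment.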
Environment
- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :
Anything else?
No response