Implement C++ Inference Demo for MiniCPM4 in LLM-TPU #246

@HarmonyHu

Description:
We currently provide a Python demo for MiniCPM4 in the LLM-TPU project:

LLM-TPU repository: https://github.com/sophgo/LLM-TPU/tree/main
MiniCPM4 model demo (Python): https://github.com/sophgo/LLM-TPU/tree/main/models/MiniCPM4

Task:
Implement a C++ version of the MiniCPM4 chat demo whose functionality and output quality are comparable to the existing Python demo. The existing Qwen3 example can also serve as a reference.

Requirements:
Ensure generated responses are similar in style and quality to the Python demo.
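
One possible way to make this requirement checkable is to compare the generated token IDs of both demos for the same prompt. The sketch below assumes that both demos can dump their token IDs and that sampling is deterministic (for example greedy decoding); neither assumption is stated in this issue, and the helper name is hypothetical. Small numerical differences between the two runtimes may still cause late divergence, so the index is a diagnostic rather than a pass/fail criterion.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Return the index of the first position where the two token sequences
// differ. If one sequence is a prefix of the other, return the length of
// the shorter one; if they are identical, return their common length.
std::size_t first_divergence(const std::vector<int32_t>& python_ids,
                             const std::vector<int32_t>& cpp_ids) {
    const std::size_t n = std::min(python_ids.size(), cpp_ids.size());
    const auto diff = std::mismatch(python_ids.begin(),
                                    python_ids.begin() + n,
                                    cpp_ids.begin());
    return static_cast<std::size_t>(diff.first - python_ids.begin());
}
```

If the returned index equals the length of both sequences, the two outputs match exactly for that prompt.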

Metadata

Assignees

No one assigned

    Labels

    enhancement (New feature or request), help wanted (Extra attention is needed)
